HNSCC patient survival associate with anatomical locations and tumour stage
The samples from this study came from 84 chemotherapy and/or radiotherapy treated patients with tissues acquired from a range of sites across the mucosal head and neck regions (i.e. tongue, oral cavity, pharynx and lip). The clinical information available were analysed to assess the respective influence on treatment outcomes or patient survival. In all cases looking at either disease free survival (DFS) or overall survival (OS), the anatomical location of the sample as well as the broader tumour stages (i.e. grouping patients as either early (I and II) or advanced (III and IV) stages) was found to significantly associate with patient prognosis (based on Kaplan Meier survival analysis (Supplementary Fig. 1). For both DFS and OS, patients with samples from the tongue are associated with the worse survival outcomes while samples from the lip associates with the best survival outcomes (p < 0.0001 and p = 0.0013). Similarly, patients at the early stage of cancer (i.e. I and II) have a significantly better disease free (p = 0.0088) and overall survival (0.0013) outcomes than those at the advanced stages of cancer (i.e. III and IV). These findings are consistent for subsets of the patients with samples used in the respective assays (Supplementary Fig. 2).
Spatial multicellular proteomics analysis
The GeoMx IPA dataset consist of 138 AOIs (paired tumour-stroma regions) across 68 patient cores (each core with one ROI separated into tumour or stroma regions) with the expression matrices as integrated counts for 580 protein markers. The data were pre-processed using R package standR’s QC pipeline4 as described in methods. Briefly, correlations between the background controls (IgGs) are high but not so for the housekeeping markers (Fig. 1A). Assessing sample and protein quality, no protein marker nor any AOIs was removed based on assessment of the nuclei counts and library size (Fig. 1B) with no factor identified to be confounded with either. The counts data is normalised and corrected for systematic bias using the RLE scaling method (as shown by the centred Relative log expression (RLE) plots, Fig. 1C). PCA using the logCPM (log counts per million) highlights the factors contributing to the variances in the data (Fig. 1D). As expected, the segment type (i.e. tumour or stroma) accounts for most of the variation across PC1(23.25%) and PC2(13.45%). Interesting, PC3 (8.84%) separates the samples from the lips and tongues (Fig. 1D) middle column orange vs purple circles) while PC2 and PC3 separates the broader tumour stages (i.e. between Early and Advanced stages). Both these observations agree with that seen in the cohort survival analysis.
The difference in protein abundances between difference factors in the data was explored via differential expression analysis (Supplementary Table 1). We first compared between samples from different anatomical regions in either the Stroma or Tumour compartments. In the tumour compartments (Fig. 1E top left), EpCAM (associated with aggressive tongue cancer phenotype)5, Cytokeratin 19 and tumour marker CA9 are lower in the Lip compared to the Tongue. While in the stroma compartment (Fig. 1E top right) there appears to be lower expression of fibronectin, alpha smooth muscle actin (α-SMA), osteopontin and proteins associated with higher metastatic potential in the Lip6,7. When comparing tumour compartments from the Pharynx with those from the Tongue compartment (Fig. 1E bottom left), a higher expression of PD-L1 (associated with tongue cancers) was found to be in the Tongue and conversely higher expressions of histone modifications (associated with oral carcinogenesis and aggressive phenotype) were found in the Pharynx8,9. In the stroma compartment comparing Pharynx vs Tongue (Fig. 1E bottom right), lower expression of osteopontin, CA3 and CCR6 (associated with metastasis in head and neck cancers) are found in the Pharynx10,11.
The samples were then grouped into 3 survival groups based on their OS and DFS durations and DE between the groups within either compartments or anatomical locations were assessed. Limited DE proteins were found for most DFS comparisons (Supplementary Table 1), which may be due to confounding treatments and/or intent of treatments. On the other hand, for OS, within the stroma compartment (Fig. 2F bottom), higher expression of Wnt-related SFRP protein (inactivation of which is linked with oral carcinoma) was found in patients with better outcomes12. For the tumour compartment (Fig. 1F top), higher expression of interferon stimulated gene–15 (ISG15) protein is found in patients with poor outcomes, whereas high expression of CD44 was found in patients with improved survival. ISG15 has been found to be elevated in 80% of oral carcinomas13.
We then compared the OS groups between samples from different anatomical locations. In oral cavity (OC) samples (Fig. 2G left), there is a higher expression of ISG15 (known to be elevated in oral carcinoma)13, IFIT1 (promotes metastasis)14 proteins as well as histone modifications (associated with oral carcinogenesis and aggressive phenotype)15 in patients with poor outcomes. In Tongue samples (Fig. 2G middle), a higher expression of MMP8 (which inhibits cancer invasion and progression) protein was found in the patients with good outcomes16. In Pharynx samples (Fig. 1G right), the S100 proteins (including A8, A9 and A12) are found to be elevated in patients with poor outcomes. It is known that S100 proteins are associated with poor cancer prognosis with S100A9 linked to regulation of MMP717. These results suggest that, with the targeted protein panel afforded by the breadth of coverage from the GeoMx IPA assay, it is able allow the elucidation of known biology and markers in the TME of mucosal HNSCC.
Identification of proteomics features associated with patient survival
To assess how well each marker’s expression associates with patient’s overall survival or disease-free survival (DFS), we conducted a feature association analysis based on a univariate cox proportional hazards (CoxPH) regression model and Kaplan Meier (KM) survival analysis approach. In terms of DFS, 30 out of 68 eligible cases have an event while for OS, 21 out of the 68 cases have an event. The feature association analysis resulted in 61 proteins identified to (with CoxPH wald p-value < 0.01) associate with DFS outcomes based on stroma segments and survival group information. These proteins were split into either hazard ratio (HR) < 1 or HR > = 1 groups where KM survival analysis and long rank test was then performed on the groups stratified based on the expression levels of the upper and lower quartiles of each protein. 52 protein features were identified at a significance of p value < 0.05 (22 HR ≥ 1,30 HE < 1) (Fig. 2A). Using the similar approach, 84 proteins from the CoxPh results were filtered down to 74 proteins for DFS in the tumour segments with 51(44) and 73(64) proteins identified for OS in the stroma (tumour) segments (Fig. 2, Supplementary Tables 2, 3 and 4). Significant markers of relevance identified includes α-SMA, Interferon gamma (IFN-γ), FGF2 and Adenosine Receptor A2a (ADORA2A) associated with DFS in the Stroma while CD34, CD44, FoxP3 and CD3E are associated with DFS in the tumour segments. ADORA2A has been described to hinder anti-tumour immunity by suppressing immune cells such as T cells and thereby associated with poor outcomes18. This observation agrees in this study in terms of OS. Interestingly however, higher ADORA2A expression within the tumour was found to be associated with improved disease-free survival. Identified markers including BRCA1 and CXCR5 are associated with better OS19,20 while COL1A1 and SPP1 associates with poor survival and both of which have been cited to negatively impact patient outcomes by assisting with tumour invasion and progression6,21. IFN-γ and actin alpha 2 (ACTA2) are associated with better disease-free survival while CD34 and functional immune cell marker22, FGF2, are associated with poor disease-free survival23. IFN-γ is considered a protective serum cytokine that is involved in anti-tumour Th1 immune response and is often upregulated in HNSCC compared with healthy tissues24. Our results are consistent with previous studies that found a linear correlation between downregulation of IFN-γ associated with regional progression24. FGF2 associates with pro-tumorigenic phenotypes that are known to shift a tumour associated macrophages to an M2-like macrophage25.
Spatial transcriptomics analysis
The GeoMx WTA dataset consists of 122 AOIs across 61 patient cores (each core with one AOI separated into tumour or stroma regions) with the expression matrices as integrated counts for 18815 transcripts. The data were analysed using the same pipeline as the IPA data. Briefly, 139 negative probes while 4 AOIs were removed by the AOI QC (Supplementary Fig. 3A). The counts data was then RLE normalised (Supplementary Fig. 3B) with PCA of the logCPM generated to highlight the factors contributing to the variations in the data (Supplementary Fig. 3C). One AOI was assessed to be an outlier with extremely small normalization factor and consequently removed. From the PCA, the tumour and stroma segments separate along PC1(28.27%) and PC2(13.27%). However unlike in the IPA, none of the other factors clearly separates out in any of the other PCs.
As with the IPA, differential expression analysis was conducted to investigate the differences in transcript abundances between the same sets of factors for the WTA data (results in Supplementary Table 5). The resulting DE genes from the WTA are compared with the corresponding DE proteins in the IPA results with limited overlaps noted (Supplementary Table 6). Similarly, we compare samples from different anatomical regions in either the Stroma or Tumour compartments. In the tumour compartments (Supplementary Fig. 3D top left), SOX9 (associated with promoting nasopharyngeal carcinoma metastasis26) appears to be downregulated in the Lip compared to the Tongue. The protein expressed by this gene is also found to be DE in the IPA data. In the stroma compartment (Supplementary Fig. 3D top right), structural genes like FN1 (fibronectin), SPP1 (osteopontin)6, ACTA2 (α-SMA, smooth muscle actin), COL1A1 (Collagen I)21 and ITGA5 (Integrin alpha 5) are all downregulated in the Lip, in agreement with DE results from the IPA. Comparing tumour samples from the Pharynx with those from the Tongue compartment (Supplementary Fig. 3D bottom left), Stat3 and ErbB4 are common markers with the IPA results but with ErbB4 expression downregulated in the Pharynx, unlike in the IPA results. For the stroma samples between Pharynx and the Tongue (Supplementary Fig. 3D, bottom right), SPP1 appears to be downregulated in the Pharynx while IL12RB1 (an immune prognostic biomarker for oral squamous cell carcinoma27) is upregulated in the Pharynx, unlike in the IPA where both IL12RB1 and PD-L1 proteins were found to be elevated in the Tongue.
When comparison samples grouped by OS and DFS durations within either compartments or anatomical locations, there are limited DE genes found for most DFS comparisons (Supplementary Table 6). Interesting for OS comparison in the tumour, there are 1794 DE genes obtained of which 41 are common with the IPA results. GSEA analysis suggest downregulation of key hallmark pathways including interferon alpha/gamma, EMT and apoptosis in the patients with a better outcome. There are limited DEGs obtained when comparing the OS groups between samples from different anatomical locations (Supplementary Table 5). Of interest is the downregulation of Histone deacetylase 8 (HDAC8) in patients with good outcomes for Tongue cancers with HDAC8 a potential therapeutic target for treating oral squamous cell carcinoma28.
Cell type deconvolution identify cell types associated with patient survival
Cell type deconvolution was conducted to estimate the proportions of cell types in each sample analysed (Fig. 3A). There are clearly more malignant cells in the tumour compartments compared to the stroma compartments, providing confidence that the segmentation strategy is working reasonably well. In terms of the stroma compartments, major cell types include Plasma, Fibroblast, T cells and Epithelial cells. Comparing differences between cell type proportions did not yield any significant differences (Supplementary Table 7) and we procced to analyse association of the cell type proportions with patient survival. To this end, we applied univariate CoxPH (p < 0.05) and KM (p < 0.05) analyses with cell type proportions as features and associating with DFS and OS events. For tumour samples (Fig. 3B-C), higher proportions of Plasma and Mast cells were found to associate with better outcomes (both DFS and OS) while Lymphovascular cells associate with poorer outcomes. On the other hand, for stroma samples (Fig. 3D-E), B cells, Endothelial and Fibroblast cells all associate with poorer outcomes (DFS and OS) and again higher proportions of Plasma cells associate with better outcomes.
Identification of transcriptomics features associated with patient survival
Using the same feature association analysis approach used in the IPA data, we assess individual transcript’s (feature) expression association with patients’ OS or DFS. For DFS, 24 (27) out of 57 (60) eligible stroma (tumour) cases relapsed and, 17 (19) out of the 57 (60) stroma (tumour) cases died. The feature association analysis result in hundreds of transcripts identified to (with CoxPH wald p-value < 0.01, KM p-value < 0.05) associate with each category (i.e. OS/DFS outcomes based on segments and survival groups) (Supplementary Tables 8 and 9). Looking at the top transcripts identified for each category in either CoxPH or KM analysis (DFS in Supplementary Fig. 4 and OS in Supplementary Fig. 5), significant features of relevance identified includes SLC4A1 (solute carrier family 4 member 1, a transmembrane bicarbonate transporter involved in pH regulation, cell migration and can contribute to oxidative stress dysregulation29) and MSLN (Mesothelin, a glycoprotein that is considered a potential therapeutic target, with some studies finding correlations with immune cell infiltration into the TME, as well, it has also been associated with tumour metastasis, growth and invasion30) and TSPO (low TSPO expression in HNC has been associated with decreased 5 -year survival31) with DFS in the Stroma. NPC1L1 (Niemann-Pick C1-Like 1, contributes to cholesterol absorption and involved maintaining redox balance32) and MNAT1 (menage a trois 1, involved in the PI3K/AKT/mTOR pathway and has been associated with chemoresistance in osteosarcoma33) were seen to be elevated in patients with longer DFS in the tumour segments. Of relevance, given the relationship between MNAT1 and AKT, we found MNAT1 upregulated in the RNA data and AKT was upregulated in the protein analysis.
For overall survival, we found that in the stroma, increased expression of IZUMO1R (known to be expressed in CD4 T cells, and in particular Tregs34), was seen in tumours that had an improved survival (Supplementary Fig. 5). Additionally, increased expression of HS2ST1 (enables proteoglycan interactions in the TME and has been linked with decreased stromal cell infiltration in other 35) in the tumour was found to be associated with decreased survival.
Differences between proteomics and transcriptomics assay
Comparing between the protein and transcript assays, we looked firstly at how well the analytes correlate for matching samples and within each segment. While protein and RNA expression does not always correlate36, we noted that RNA/protein that correlates well (i.e. R > 0.5) are mostly structural (cadherin, keratin and integrin) and antigen recognition genes (Fig. 4A) while most uncorrelated genes are functional genes related to metabolic and chemokine functions. These findings highlight critical differences between utilising assay types to study biological functions.
We then compared the markers identified as having significant association with patient survival between IPA and WTA assays (Fig. 5B-C). With the difference in panel size, it is unsurprising that WTA will have significantly greater number of identified markers and with greater overlaps. For the WTA and within the tumour (stroma), 226 (252) genes were common between OS and DFS, with 303 (166) and 225 (301) unique genes respectively. For the IPA analysis within the tumour (stroma), 19 (7) proteins were common across OS and DFS, with 17 (9) and 19 (14) unique proteins respectively. Only FN1, was found to be common in both assays within the tumour while the genes (or protein encoded by the genes) ADORA2A, SPP1, ACTA2, CTNNB1 and HSP90AA1 were common in both assays within the stroma (Fig. 4C). As is expected, most of these are structural proteins which correlates well between the assays.
Through assessing both the IPA and WTA assays, we were interested in uncovering what markers might be associated with clinical response that were unique to protein expression and not found within the WTA (Supplementary Table 10). These markers may be more translatable in the translational setting which, with the more focused IPA panel, can be appropriately identified in the analysis. We identified several lymphocyte markers unique to the IPA analysis that were linked to alternate clinical outcomes within the tumour regions. Firstly, we identified high intra-tumoral CXCR5 expression in patients with improved outcomes. CXCR5 is a chemoattractant cytokine receptor that is typically expressed on blood and peripheral lymph node B-cells and some subsets of T cells and is involved in the trafficking of these lymphocytes from the blood to lymphoid organs37. Consistent with our findings, CXCR5 expression in HNSCC has been associated with the positive infiltration of lymphocytes and has been correlated with improved survival outcomes20,38. There is extensive research to suggest that lymphocytes play a crucial role in anti-tumour immunity and patient outcomes. CD3e, a hallmark lymphocyte marker, was also found to be upregulated within the tumour of patients with an HR < 1 for both disease-free survival and overall survival (Fig. 2C, Supplementary Table 10). While lymphocytes can be anti-tumorigenic, there are subsets that can have pro-tumorigenic effects, and therefore characterising the subtypes of lymphocytes present within the TME allows us to better appreciate the dynamics and state of the disease. Using the IPA panel, we were able to identify functional lymphocyte markers that were associated with different patient outcomes, and that were not differentially expressed in their RNA form. Cytotoxic T lymphocyte antigen 4 (CTLA-4) is an immune checkpoint marker and is expressed primarily on T cells, with CTLA-4 + Tregs having the ability to produce immunosuppressive molecules that induce T cell dysfunction, exhaustion, and negatively regulate the immune response39. Furthermore, clinical trials using CTLA-4 inhibitors as an immunotherapy in conjunction with radiotherapy and cetuximab, have shown benefit in a subset of patients that are not highly expressing PD1, LAG3 or CD3940. Within our data we found higher levels of CTLA-4 within the tumour regions of patients with a worse disease-free survival and overall survival (Fig. 2D, Supplementary Table 10). In addition, we see higher levels of FOXP3, commonly expressed on Tregs34, in both poor DFS and OS (Fig. 2C, Supplementary Table 10). From these findings we infer that these patients have an increased infiltration of immunosuppressive Tregs and exhausted T cells. Given that we see tumour infiltrating lymphocytes present in both poor and good survival, being able to functionally characterise into immune subtypes is prognostically valuable.
Immunofluorescent- proteomic validation
From the list of significant markers identified from the IPA dataset for DFS and OS, two proteins, CD34 and CD44, were amongst the proteins found to contribute to clinical outcomes and also featured in the PCF panel. To validate the findings from the IPA analysis, we analysed a single cell multiplex PC dataset from a serial section to orthogonally validate the results. For the validation using PCF, data were acquired and during quality control, a few cores were excluded from downstream analysis due to poor tissue staining/quality and artifacts (substantial blurs/bubbles). We also noted that there are inherent differences between the two assays for instance, the IPA assay looks at total intensity across a region, based on quantified expression, whereas the PCF data uses single cell binary classifications of cells based on quantification of immunofluorescent intensity signals. However, despite these caveats, we were able to validate the findings across the technologies.
We found that in the IPA analysis CD44 had a HR of > 1 in the tumour and was associated with DFS and OS. Additionally, CD34 was found to be associated with a HR of > 1 in the tumour for DFS (Fig. 2, Supplementary table 3). Across various cancer types, including HNSCC, CD44 expression is often considered a cancer stem cell marker and is associated with aggressive tumours, disease reoccurrence and worse patient outcomes41. Interestingly, within our cohort we found CD44 expression differentially expressed in patients with better survival. Contrary to most current research, across both proteomic technologies (IPA and PCF), high CD44 expression in the tumour was associated with improved survival (HR < 1). When we compared the survival of patients in lowest and highest quartiles (Low: n = 13, High: n = 13) for CD44 + PanCk + cells within the tumour (Fig. 5A) we found that patients had an increased DFS (p = 0.05) (Fig. 5B) and OS, although not significant, (p = 0.14) (Supplementary Fig. 6) associated with increased proportion of CD44 + PanCk + cells, validating the findings from the IPA analysis (Fig. 2C) where CD44 expression is associated with good OS and DFS outcomes in the tumour segment.
CD34 protein is a hematopoietic and endothelial stem cell marker, involved in the formation of blood vessels during injury response22,42. CD34 is also thought to facilitate the adherence of specialised cells to lymphocytes, as well as being involved in tumour angiogenesis, the promotion of tumour reoccurrence and metastasis, and a marker on fibroblast progenitors22. It has been found that in the peripheral blood of HNSCC patients, CD34 + cells depressed the functions of T-lymphocytes, potentially through the release of immunosuppressive cytokine transforming growth factor-β, negatively impacting patient outcomes23. In our study, for both the IPA and PCF datasets, CD34 was found to be associated with poor DFS. Specifically, CD34 expression was found to be upregulated in the tumour region of patients with worse DFS in the IPA dataset (Supplementary Table 3). Using the PCF data, we assessed the proportion of CD34 + PanCk- cells within the tumour masked regions and conducted the survival analysis to validate the findings from the IPA analysis (Fig. 5C). Samples that are lower than the 25th percentile (n = 13), or greater than the 75th percentile (n = 13) we see a significant difference (p = 0.013, Fig. 5D) in line with the results found for the IPA analysis.