Patient cohorts
This study was performed in accordance with ethical guidelines for clinical research with the approval of the ethics committee for each institution. At Huntsville Clearview Cancer Center (“HV”), all patients included in the study underwent surgery for histologically proven stage II colon cancer between January 1997 and December 2008. For Memorial Sloan Kettering Cancer Center cohorts (“MSK1” and “MSK2”), patients’ exclusion criteria were as follows: postoperative mortality within 30 days; a limited follow-up period of less than 3 years in cases without recurrence; synchronous multiple cancers; positive surgical margins; concomitant inflammatory bowel disease; and familial adenomatous polyposis or previous malignancy within 5 years. No patient received preoperative chemotherapy or radiotherapy.
Combining cohorts for integrated discovery analysis
To evaluate the association of biomarkers and cell clusters with outcomes (recurrence within 5 years) and treatment response, we carefully combined two cohorts (HV and MSK1) in a way that minimized the effects of bias as a confounding factor. We evaluated two aspects to ensure balance between the two cohorts: 1) balanced clinicopathological risk factors and 2) biomarker comparability. For clinicopathological risk factors, Chi-square tests were performed to determine whether there was any dependency between the risk factors and treatment status for each cohort. Available risk factors included age, sex, tumor location, T stage, tumor differentiation, and nodal count. There was a high number of T3 patients among untreated patients in the MSK cohort, and 197 patients were randomly filtered out to achieve a balance of T3 patients in each group. Secondly, biomarker intensity was evaluated between the two cohorts. We firstly implemented utilized our validate normalization procedure42 to minimize the batch effect among different slides. We also needed to be cautious about any bias derived from cohort differences. Therefore, to test whether cohort differences were significantly larger than slide differences within the cohorts, JSD (Jensen Shannon Divergence43) was used to measure the distance between the two distributions. This allows us not only to evaluate the mean difference but also measure the difference of the entire distribution. To compare the distances among the slides within a cohort vs. across the two cohorts, we used Wilcox test with multiple hypothesis testing adjustment using Benjamini & Hochberg44 method.
Antibody validation and multiplexing workflow
All antibodies used in this study were subjected to a standardized characterization process using a tissue microarray (TMA) and appropriate controls to evaluate the specificity and sensitivity of the primary antibody and its dye-conjugated derivative, including the cyclic testing of the dye inactivation treatment compared to singleplex staining. This protocol is described in detail on protocols.io22 and in Gerdes et al (2013)20. All markers and antibody clones and final concentrations are shown in Supplementary Table 1. Multiplexed immunofluorescence (MxIF) staining of the skin samples was performed as previously described 20 using Cell DIVE™ technology (Leica Microsystems, Issaquah, WA). After de-paraffinization and a two-step antigen retrieval, the FFPE slides were stained with DAPI and imaged in all channels of interest to acquire background autofluorescence (AF) of the tissue. This was followed by indirect detection and/or direct conjugate antibody staining of up to 3 markers per round plus DAPI, dye deactivation, and repeat staining to collect images of all planned biomarkers. Multiplexed images are automatically registered and processed for illumination correction and autofluorescence subtraction during each round of imaging using the Cell DIVE image processing workflow.
Image and data processing
Cells in the epithelial and stromal compartments were segmented using DAPI, pan-cytokeratin, S6, and NaKATPase, as previously described20,23. Images for every core were then reviewed for tissue quality (tissue loss or damage) and image segmentation quality. Images not passing criteria (poor biomarker staining, too few cells, or poor segmentation due to damage or poor staining of segmentation markers) were excluded from data analysis. Several quality control steps were conducted: 1) cell filtering based on the following criteria: epithelial cells required to have 1-2 number of nuclei; 2) each sub-cellular compartment (nucleus, membrane, cytoplasm) area > 10 pixels < 1500 pixels; and 3) cells in each round of staining have to have good alignment (minimum 80% for Huntsville cohort, and 85% for MSK cohorts) with first round of staining (automatic tissue quality index=1 at each round, which is the correlation between each image and the DAPI image). After quality review, the data was further processed including normalization to remove batch effects42, and log2 transformation to handle the skewness of the marker intensities.
K-means clustering of intrinsic and extrinsic pathway proteins
We performed K-means clustering of markers in the intrinsic and extrinsic pathways to quantify the distributions of multiple apoptosis proteins at cell level. Extreme values (1% on both tails) were capped and standardized with zero mean and unit variance. Then, we applied K-means clustering with group size k=2,…,15. To determine the number of clusters that best represented the data, we also performed consensus clustering 45, then evaluated PAC (proportion of ambiguously clustered) and heatmaps. Once the number of groups was determined, a patient level cluster profile was calculated, which is the proportion of each cluster group within the entire cell population of a patient. The cluster profile at patient level was used to evaluate the correlation with the outcome (5 years recurrence) using Cox proportional model. Specifically, we performed the clustering in three sets of biomarkers: intrinsic and extrinsic pathway combined, and the intrinsic extrinsic pathways separately. For the integrated pathway model, we down-selected the markers using Davies-Bouldin's index24 to measure the contribution of each individual markers to the cluster separation from the rest of the cells. To evaluate the cohort bias, we also performed K-means clustering after scaling within the cohort separately.
DR_MOMP ODE model to calculate MOMP sensitivity
Protein levels of BAK, BAX, BCL2, BCL(X)L, and MCL1 were normalized to the mean protein levels in HeLa cells spotted on separate slides that were stained alongside slides used in this study. Protein molar concentrations were calculated using previously established HeLa concentrations and used as input for DR_MOMP as previously described25. DR_MOMP was translated from its MATLAB implementation to C++ and R using deSolve (1.28), doParallel (1.0.15), and Rcpp (1.0.5). For each core the maximum % level of pores was calculated after simulating an approximated mean stress dose (200 nM; estimated from the patient population as a threshold17,25,46. Cells were considered to have low sensitivity for MOMP if the % level of pores was <10%25. We found a Pearson correlation coefficient of −0.82 (95% CI −0.83 to −0.82; p < 0.001) between the % level of pores and the (previously) used stress dose required for MOMP (log-transformed) calculated for 10,000 randomly chosen cancer cells25.
APOPTO-CELL model to calculate caspase-3 cleavage
The APOPTO-CELL model10 was implemented in the MATLAB environment equipped with Statistics and Parallel toolboxes (version 2014b, The MathWorks, Inc., Natick, MA, USA). Protein concentrations to use as input for the APOPTO-CELL model, namely procaspase-3, procaspase-9, SMAC, and XIAP, were estimated by aligning the signal intensities in arbitrary units to molar concentrations (in uM) with an established pipeline18 using as a reference protein molar profiles determined in a clinically-relevant CRC cohort47, as previously described 18,40,41. Previous research 40,48 has shown APAF1 not to be the limiting factor in apoptosome formation in the CRC settings. Thus, APAF1 cell-specific protein levels were set to the median expression (0.123 uM) previously determined in clinically relevant CRC cohort 48, as previously described 18,40,41.
Statistical analysis
Univariate analysis was performed at patient level correlating the outcome and risk of recurrence within 5 years. For intrinsic/extrinsic pathway markers, we calculated the average intensity of all cells in the entire epithelial mask per patient, including across multiple cores for the same patient, when available. After binarizing the continuous scale average using median cut point, Cox proportional hazard model was fitted for each biomarker on treated and untreated patients separately. Multiple hypothesis test correction using Benjamini & Hochberg44 method was applied. Inter-marker correlations were evaluated at cell level as well as a patient level (using average values), which was used for data QC and to confirm expected biomarker associations. Following univariate analysis, stepwise feature selection (My.stepwise.coxph function in R) was used to optimize the Cox-proportional hazard model performance.