Human BrCa cell lines
Human breast cancer cell lines representing the three major subtypes, namely luminal (MCF7 and T47D), TN (MDA-MB-231, MDA-MB-468, MDA-MB-157 and HBL100) and HER2-enriched (SK-BR3), were cultured in RPMI (Life Technologies) containing 10% fetal calf serum and 1% antibiotic and antimycotic, and were maintained at 37°C with 5% CO2 in a humidified atmosphere. Cell lines were authenticated by the Garvan Institute of Medical Research using short tandem repeat DNA profiling and were found to be >93% concordant.
The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) cohort
The data used in this study consist of transcriptomic (cDNA microarray) information processed using the Illumina HT-12 v3 platform (Illumina_Human_WG-v3). Gene expression values of primary breast tumors were extracted for the luminal (1140 samples), HER2-enriched (220 samples) and TN (199 samples) subtypes, and for healthy control (HC) tissues (144 samples) (22).
Patient cohort
The two cohorts of BrCa clinical samples used in this study were sourced from the Victoria Cancer Biobank consortium, the Australian Breast Cancer Tissue Bank, or the Strathfield Breast Centre. Cohort 1 comprised 506 serum samples: 408 from BrCa patients with luminal, HER2-enriched or TN subtypes, and 98 from HC. Cohort 2 consisted of 30 formalin-fixed tumor tissues from TN, HER2-enriched or luminal subtype BrCa patients. As shown in Table 1, the groups were well distributed in both cohorts. Mean age was comparable across the BrCa groups, whereas the HC group was significantly younger (p<0.0001).
All samples used in this study were from female patients who were diagnosed with primary BrCa as their first cancer event. Blood was collected before surgery and tumor tissues were collected before chemotherapy treatment. Estrogen receptor, progesterone receptor and HER2 status were determined by qualified pathologists using immunohistochemistry. HC sera were sourced from the Australian Breast Cancer Tissue Bank.
mRNA extraction and qPCR
To determine gene expression along the KP, total mRNA was extracted with the RNeasy Mini Kit (Qiagen) according to the manufacturer’s instructions. Following extraction, the quantity and quality of the total mRNA were measured using the Nanodrop 2000 (Thermo Fisher Scientific). For cDNA synthesis, 2 μg of total mRNA was reverse transcribed with the Superscript VILO cDNA Synthesis Kit (Thermo Fisher Scientific) according to the manufacturer’s instructions. qPCR reactions were performed on the ViiA 7 (Thermo Fisher Scientific) in a final volume of 10 μl, with each reaction mix containing 5 μl Fast SYBR® Green Master Mix, 5 μM forward and reverse primers, and 125 ng of cDNA template. The reaction was incubated at 95°C for 20 seconds, then amplified for 40 cycles of 95°C for 1 second and 60°C for 20 seconds. A melting curve was generated at the end of each reaction to confirm that only one product was formed. The mRNA expression levels of KP genes were normalized to TATA-binding protein (TBP) and expressed relative to the untreated control condition using the 2^-ΔΔCT method. The sequences and efficiencies of the qPCR primers were generated in accordance with the MIQE guidelines (23) and are shown in Supplementary Table 1.
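The comparative CT (2^-ΔΔCT) normalization can be illustrated with a minimal Python sketch. The function name and all CT values below are hypothetical and serve only to show the arithmetic; the calculation assumes a reference gene (here TBP) measured in the same samples and amplification efficiencies close to 100%.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative CT (2^-ddCT) method.

    All arguments are cycle-threshold (CT) values; the reference gene
    is measured in the same sample as the target gene.
    """
    # Step 1: normalize the target gene to the reference gene in each condition
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Step 2: express the treated condition relative to the untreated control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    # Step 3: convert the CT difference to a fold change
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values (not taken from the study)
fold_change = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=26.0,
    ct_target_control=27.0, ct_ref_control=26.0,
)
print(fold_change)
```

With these illustrative inputs, ΔΔCT is -3, corresponding to an eight-fold increase relative to the untreated control.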
Protein lysate preparation and Western Blot Assay
Cells were plated to achieve 70% confluency and treated for 48 hrs with IFN-γ (specific activity: 1 × 10^7 IU/mg; Miltenyi Biotech) or RPMI media as control. The cells were then lysed in a buffer containing 20 mM Tris-HCl (pH 8.0), 137 mM NaCl, 1% NP40, 10% glycerol and 1x protease inhibitor cocktail (Promega). Protein concentrations were measured with the Pierce™ BCA protein assay kit (Thermo Fisher Scientific). NuPAGE® sample reducing agent (Thermo Fisher Scientific) and Laemmli buffer (Bio-Rad) were added to the samples, which were then heated to 70°C for ten minutes. Denatured samples were transferred onto ice before separation by electrophoresis on an Any kD™ Mini-PROTEAN® TGX protein gel (Bio-Rad). The proteins were then transferred to nitrocellulose membranes and blocked with 5% skim milk for an hour. Blots were probed overnight at 4°C with primary antibodies: IDO1 (1:1000; clone: UMAB126, Origene), KMO (1:1000; LSBio), kynureninase (KYNU) (1:500; clone: OTI1H1, Origene) and actin (1:1000; Abcam). Blots were then incubated for an hour with the secondary antibody matching each primary antibody, either anti-mouse (1:10,000; Dako) or anti-rabbit (1:12,000; Dako), before developing with Clarity™ Western ECL substrate (Bio-Rad).
Quantification of KP metabolites
Prior to analysis, 150 µL of biological fluid was deproteinized with an equal volume of 10% (w/v) trichloroacetic acid. Samples were incubated for 5 min, vortexed, then centrifuged at 4°C for 10 min at 12,000 rpm. Supernatants were then collected and filtered through 0.22 μm syringe filters (Millex, Merck), ready for injection into the analyzers.
Concurrent quantification of TRP, KYN, 3-HK, 3HAA, and AA was carried out as previously described (24). Briefly, 20 µL of the filtered extract was injected into the analyzer. Metabolites were separated at a stable temperature of 38°C for 12 min, using 0.1 mM sodium acetate (pH 4.65) as the mobile phase at an isocratic flow rate of 0.75 mL/min on an Eclipse Plus C18 reverse-phase column (2.1 mm x 150 mm, 1.8 μm particle size, Agilent). 3-HK and KYN were detected by UV absorbance at 365 nm. TRP, 3HAA and AA were detected by fluorescence at Ex/Em wavelengths of 280/438 nm for TRP and 320/438 nm for 3HAA and AA. Mixed standards of all metabolites were used to construct a six-point calibration curve from which sample concentrations were interpolated. Agilent OpenLAB CDS Chemstation (Edition C.01.04) was used to analyze the chromatograms. The inter- and intra-assay coefficients of variation were within the acceptable range of 3-7%. Concentrations of KP metabolites in cell culture media were calculated as the difference between pre- and post-treatment concentrations.
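Interpolation from a six-point calibration curve and calculation of the coefficient of variation can be sketched as follows. This is an illustration only: a linear detector response fitted by ordinary least squares is assumed, and the standard concentrations, peak areas and function names are hypothetical rather than taken from the study or from the chromatography software.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def interpolate_concentration(peak_area, slope, intercept):
    """Invert the calibration line to obtain a concentration from a peak area."""
    return (peak_area - intercept) / slope


def coefficient_of_variation(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical six-point standard curve: concentration (uM) versus peak area
standard_conc = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
standard_area = [9.0, 15.0, 27.0, 63.0, 123.0, 243.0]

slope, intercept = fit_line(standard_conc, standard_area)
sample_conc = interpolate_concentration(75.0, slope, intercept)
replicate_cv = coefficient_of_variation([9.5, 10.0, 10.5])
print(sample_conc, replicate_cv)
```

Any correction for the 1:1 dilution introduced during deproteinization is deliberately left out of this sketch, because it depends on whether the standards were processed in the same way as the samples.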
Immunohistochemistry and scoring of staining
Formalin-fixed paraffin-embedded tissue sections (8 μm) were purchased from the Victoria Cancer Biobank. Sections were deparaffinized and rehydrated through graded alcohols to water. Antigen retrieval was performed by boiling the deparaffinized sections in the buffer specified for each antibody. The slides were placed onto a chamber stacker and rinsed three times with wash buffer (Dako). Endogenous peroxidase activity was blocked by a 10 min incubation with Dual Endogenous Enzyme-Blocking Reagent (Dako). Thereafter, the slides were rinsed with wash buffer (Dako) and blocked with 5% BSA (Sigma Aldrich) in PBS-T (PBS with 0.2% Tween20) for 1 hr at room temperature. The primary KP enzyme antibodies (IDO1, 1:100 and KYNU, 1:100, both as described above for western blotting; KMO, 1:100, Sigma Aldrich) and isotype control antibodies (IDO1 isotype control IgG1, clone DAK-GO1, Dako; KMO isotype control rabbit IgG, Abcam; KYNU isotype control IgG2b, clone DAK-GO9, Dako) were applied overnight at 4°C. After primary antibody incubation, sections were washed and incubated for 1 hr with a peroxidase-labeled secondary antibody specific for each primary antibody.
The slides were scored numerically by three blinded researchers and a composite staining score was calculated based on two categories: (1) the percentage of tumor that stained positive (0 = 0%, 1 = 1-33%, 2 = 34-66%, 3 = >66%), and (2) the intensity of protein staining (0, 1, 2, 3). Differences in scores were adjudicated between the researchers to arrive at a final score.
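The two scoring categories can be expressed as a short Python sketch. The percentage bins follow the categories given above; how the two categories are combined into the composite score is not specified here, so the sketch assumes, purely for illustration, that they are summed (range 0-6). The function names and the handling of fractional percentages are likewise assumptions.

```python
def percent_category(percent_positive):
    """Map the percentage of positively stained tumor to a 0-3 category."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive <= 33:
        return 1  # 1-33% (assumption: any staining above 0% counts as positive)
    if percent_positive <= 66:
        return 2  # 34-66%
    return 3      # >66%


def composite_score(percent_positive, intensity):
    """Composite staining score.

    ASSUMPTION: the two categories are summed; the combination rule is
    not stated in the text and a product would be an equally plausible choice.
    """
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_category(percent_positive) + intensity


# Hypothetical section: 50% of tumor stained, strong intensity
print(composite_score(50, 3))
```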
Statistical analysis and modelling
Descriptive statistics were used to identify outliers and missing data and to assess the normality of KP variables and demographics. Where needed, data normalization was performed prior to analysis. Exploratory data analysis involving multiple groups or case-control comparisons was performed using one-way ANOVA and t-test, respectively. Differences in the expression/level of variables of interest were considered significant if p<0.05.
To develop an algorithm that can potentially discriminate the BrCa subtypes based on predictors (i.e. variables of interest identified during exploratory analysis), a supervised machine learning approach using various classification models was applied. These models included the Classification and Regression Tree, Neural Networks (25), Support Vector Machines, Discriminant Analysis and C5.0 Decision Tree (26, 27), which were previously described for a similar study design (24). First, we randomly split the dataset into training (77%) and test (23%) sets. Then, an iterative model-building framework was used to find the best model for our aim. A classification model was considered successful when it had the highest predictive accuracy for specific subtype observations. In addition to accuracy, we calculated the class-specific lift, sensitivity and specificity of the models. To minimize overfitting of the model, 10-fold cross-validation and pruning set at 75% were implemented during the analysis.
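For reference, the one-way ANOVA test statistic can be computed from first principles as in the sketch below. The group values are hypothetical and the function name is illustrative; the p-value, which is obtained from the F distribution with the returned degrees of freedom, is left to a statistical package and is not computed here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical metabolite levels in three groups (not study data)
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova_f(groups))
```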
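The random 77%/23% split and the class-specific performance measures can be sketched in Python as follows. This is not the SPSS Modeler workflow used for the analysis; the function names, the fixed random seed and the example labels are hypothetical, and lift is computed under one common definition (precision for the class divided by the prevalence of that class), which is an assumption.

```python
import random


def split_dataset(rows, train_fraction=0.77, seed=0):
    """Randomly split rows into training and test sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def _ratio(numerator, denominator):
    """Safe division; returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def class_metrics(actual, predicted, target):
    """Sensitivity, specificity and lift for one class (one-vs-rest)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == target and p == target for a, p in pairs)
    fn = sum(a == target and p != target for a, p in pairs)
    fp = sum(a != target and p == target for a, p in pairs)
    tn = sum(a != target and p != target for a, p in pairs)
    precision = _ratio(tp, tp + fp)
    prevalence = _ratio(tp + fn, len(pairs))
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        # ASSUMPTION: lift = precision / prevalence of the target class
        "lift": _ratio(precision, prevalence),
    }


# Hypothetical subtype labels for six test-set observations
actual = ["TN", "TN", "LUM", "LUM", "HER2", "LUM"]
predicted = ["TN", "LUM", "LUM", "LUM", "HER2", "TN"]
print(class_metrics(actual, predicted, target="TN"))

train, test = split_dataset(range(100))
print(len(train), len(test))
```

A lift above 1 indicates that observations predicted as a given subtype are enriched for that subtype relative to its overall frequency in the data.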
All statistical analyses were performed using the R software (R Core Team, 2015), with the R package (28), and illustrated with Prism 8 (GraphPad) and Excel. All classification models were developed using IBM SPSS Modeler (version 18.0, 2016).