RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Prof. Owen Sansom ([email protected]).
Materials availability
This study did not generate new unique reagents.
Data and code availability
- Exome sequencing data are available through the Sequence Read Archive (SRA) database, accession number PRJNA983771.
- Single-cell RNA-sequencing data have been deposited in the Gene Expression Omnibus (GEO) database and are publicly available, accession number GSE234511.
- RNA sequencing data have been deposited in the GEO database and are publicly available, accession number GSE234501.
- This paper does not report original code.
- Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Animals
Species used: Mus musculus
Tumor Models and Treatments
All animal experiments were performed under UK Home Office licence (Project Licenses 70/8646, 70/9112 and PP3908577), and were approved by the animal welfare and ethical review board of the University of Glasgow. Researchers adhered to ARRIVE guidelines when reporting the experiments. The experiments were not randomized, but the researchers were blinded to the mouse genotypes/treatments during data collection. Mice were genotyped by Transnetyx. Mice were housed in a specific pathogen-free facility in conventional cages with a 12-hour light/dark cycle and were allowed free access to diet and water. Mice of both sexes were included in the experimental cohorts (unless otherwise stated in the figure legend). Recombination of floxed alleles was induced with a single intraperitoneal injection of 2 mg tamoxifen (Sigma-Aldrich, T5648) into mice aged 6–12 weeks and weighing at least 20 g. For intracolonic inductions, a single dose of 70 μl of 100 μM 4-hydroxytamoxifen (Sigma, H7904) was injected into the colonic submucosa using a TelePack VetX LED endoscope (Karl Storz). For azoxymethane (AOM)-induced tumorigenesis in the colon, AOM treatment was started 4 days post tamoxifen induction, with mice subsequently given one intraperitoneal injection of 10 mg/kg AOM weekly for 6 consecutive weeks. All experiments were performed on a C57BL/6J background. The alleles and transgenes used were: villin-creERT2 49, ApcMin 50, Apcfl 51, KrasG12D 52, Trp53fl 53, Rosa26N1icd 54, iCCR- 25, Ccr2- 55, iCCR-reporter 12. Mice were sampled at clinical endpoint, which was defined as weight loss and/or paling feet and/or hunching and/or bleeding and/or diarrhea. Tumor-free mice were censored 12 months after tamoxifen administration or for health issues unrelated to an intestinal tumor burden (mainly skin wounds in iCCR-/- mice or lymphomas in AOM-treated mice).
The CSF1R-inhibitor AZD7507 (obtained from AstraZeneca), CXCR2-inhibitor AZD5069 (obtained from AstraZeneca), and CCR1-inhibitor BL5923 (obtained from Novartis) were administered by oral gavage at 100 mg/kg (25 mg/ml) twice daily, and the CCR5-inhibitor maraviroc (MedChemExpress, HY-13004) was administered twice daily at 50 mg/kg (12.5 mg/ml). Drugs were resuspended in 0.5% hydroxypropyl methylcellulose (HPMC; Sigma, G8384) and 0.1% Tween-80 (Sigma, P8074). As vehicle control for AZD7507, AZD5069, BL5923, and maraviroc, mice were given 0.5% HPMC and 0.1% Tween-80 under the same regimen. Treatments were started at day 70 for ApcMin/+ mice; the day after tamoxifen induction for AK mice; 85 days post tamoxifen injection for KPN mice; one week after the last AOM dose for AOM-treated mice; or after tumors were detected by colonoscopy in mice intracolonically induced with tamoxifen or implanted with organoids (1–2 weeks post induction or organoid injection). Short-term treatments of tumor-bearing mice were started when mice showed signs of tumorigenesis (5% weight loss and/or bleeding and/or diarrhea).
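As an illustration of the dosing arithmetic, the sketch below converts a dose (mg/kg) and a formulation concentration (mg/ml) into a gavage volume for one mouse. The function name and the 20 g example body weight are assumptions made for illustration only; they are not taken from the study protocol.

```python
def gavage_volume_ul(body_weight_g: float,
                     dose_mg_per_kg: float,
                     conc_mg_per_ml: float) -> float:
    """Return the volume (in microlitres) that delivers one dose.

    Hypothetical helper for illustration; not part of the study's methods.
    """
    # Amount of drug required for this animal, in mg
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0
    # Volume of formulation that contains that amount, in ml
    volume_ml = dose_mg / conc_mg_per_ml
    return volume_ml * 1000.0


# Example body weight of 20 g is an assumption for illustration.
print(gavage_volume_ul(20.0, 100.0, 25.0))  # AZD7507, AZD5069 or BL5923
print(gavage_volume_ul(20.0, 50.0, 12.5))   # maraviroc
```

Both regimens correspond to the same dosing volume of 4 ml/kg, that is, about 80 µl for a 20 g mouse.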
Patient cohorts
Four different patient cohorts were used to assess the clinical value of the macrophage signature derived from the PN mouse model. The Academic Medical Center (AMC) cohort data are available at GSE33113. The Almac cohort, consisting of 215 stage II colon cancer patients, was downloaded from ArrayExpress (E-MTAB-863). The processed TCGA COREAD RNA-Seq dataset (n = 577) was downloaded directly from the Guinney et al. CMS-classifier19 study via Synapse (ID: syn2023932), where it has been described previously.
The GRI cohort is a retrospective cohort consisting of stage II–III CRC patients and was utilised for validation purposes (n=787)56. Patients were staged using the 5th edition of TNM staging and underwent surgical resection with curative intent between 1997 and 2013 within the Greater Glasgow and Clyde NHS board. Patients who received neoadjuvant therapy or died within 30 days of surgery were excluded from analysis. Data were deposited with Glasgow Safehaven (#GSH21ON009) and ethical approval was in place (MREC/01/0/3). To determine the CMS class, transcriptional profiling was performed using formalin-fixed paraffin-embedded (FFPE) tissue resections through TempO-Seq technology (Bioclavis Ltd.) as previously described57.
METHOD DETAILS
Scoring of tumor burden and tumor stage
The intestines were opened longitudinally, pinned onto plates, and fixed with formalin overnight at room temperature. Tumors were counted on the following day, and tumor sizes were scored in mm to calculate the individual tumor area. Total tumor burden was calculated by adding the areas of individual tumors present in the small intestine (SI) and colon. T-staging of tumors was performed according to the following parameters included in the classical tumor/node/metastasis (TNM) classification: Tis, T1–T2, tumors present and invading into the muscularis propria; T3, tumors invading into the subserosa; T4, tumors invading/perforating the visceral peritoneum and into other adjacent organs/structures.
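The summation of individual tumor areas into a total burden can be sketched as follows. The text does not state how each area was derived from the measurements in mm, so the length × width approximation used here, along with all function names and example values, is an assumption for illustration only.

```python
from typing import Iterable, Tuple

Tumor = Tuple[float, float]  # (length_mm, width_mm); hypothetical record layout


def tumor_area_mm2(length_mm: float, width_mm: float) -> float:
    # ASSUMPTION: rectangular approximation of tumor area.
    # The source does not specify the formula that was used.
    return length_mm * width_mm


def total_tumor_burden_mm2(si_tumors: Iterable[Tumor],
                           colon_tumors: Iterable[Tumor]) -> float:
    """Sum the areas of all tumors in the small intestine and colon."""
    all_tumors = list(si_tumors) + list(colon_tumors)
    return sum(tumor_area_mm2(length, width) for length, width in all_tumors)


# Made-up measurements for one mouse
si = [(2.0, 3.0), (1.0, 1.0)]
colon = [(4.0, 2.0)]
print(len(si) + len(colon), total_tumor_burden_mm2(si, colon))
```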
Blood count analysis
Blood was collected in EDTA-coated tubes after cardiac puncture of mice culled by carbon dioxide inhalation. Blood samples were analyzed with the IDEXX ProCyte Dx automated hematology analyzer following the manufacturer’s protocol.
Organoid culture
First, ADF medium was prepared by supplementing Advanced DMEM/F12 with penicillin/streptomycin (100 U/ml and 100 µg/ml, respectively) (15140122), 2 mM L-glutamine (25030081), 10 mM HEPES (15630080), N2-supplement (17502001) and B27-supplement (17504044) (all from Gibco, Life Technologies/Thermo Fisher Scientific). To prepare complete ADF, ADF was supplemented with 50 ng/ml recombinant human EGF (Peprotech, AF10015), 100 ng/ml recombinant murine Noggin (Peprotech, 25038) and 500 ng/ml recombinant mouse R-spondin 1 (Biotechne, 3474-RS).
A single PN tumor or a tumor piece was excised at the time of dissection and placed in PBS over ice until further processing. Tumors were cut into small fragments using sterile scalpels and washed in PBS. Tumor fragments were incubated in 5 ml of 10× trypsin (5 mg/ml, Gibco, 15090046) and 200 U recombinant DNase I (Roche, 04716728001) for 30 minutes at 37 °C. To further dissociate tumor fragments, 10 ml ADF was added and tumor fragments were shaken vigorously. The suspension was passed through a 70 µm cell strainer using a 5 ml syringe plunger to mash the tumor over the strainer. Cells were pelleted by centrifugation, and the cell pellet was resuspended in a volume of Matrigel (BD Bioscience, 356231) that depended on the pellet volume; 150 µl was seeded in one well of a 6-well plate. Organoids were cultured in complete ADF at 37 °C, 5% CO2, 21% O2, and passaged or collected after reaching 80% confluency (2–3 days post seeding). For RNA or DNA extraction, cells were pelleted and stored at -80 °C until further processing.
Needle-guided intracolonic organoid transplantation
Colonic sub-mucosal injections of organoids were performed as previously described 31, using a Karl Storz TELE PACK VET X LED endoscopic video unit. KPN (villin-creERT2 KrasG12D/+ Trp53fl/fl Rosa26N1icd/+) organoids were derived from KPN liver metastases while AKPT (villin-creERT2 Apcfl/fl KrasG12D/+ Trp53fl/fl Tgfbr1fl/fl) organoids were derived from an autochthonous AKPT SI tumor. Organoids of both genotypes were cultured, as described above, in complete ADF medium without R-spondin 1. Organoids were harvested upon reaching 80% confluency, dissociated by vigorous pipetting, and washed twice with PBS before injection. Approximately 500 organoids in 70 µl PBS were injected in a single injection. All experiments with organoids were performed before passage 20. Sex-matched organoid lines were used when single-sex experiments (male or female) were performed. Organoid lines were tested for mycoplasma using MycoAlert (Lonza, LT07-218) before in vivo injection.
Histology and immunohistochemistry
Hematoxylin and eosin (H&E) staining, immunohistochemistry (IHC), and RNAScope in situ hybridization (ISH) were performed on 4 µm FFPE sections, which had previously been heated at 60 °C for 2 hours. All IHC and ISH staining was performed on a Leica Bond Rx autostainer.
Sections for F4/80 (Abcam, ab6640) IHC staining underwent on-board dewaxing using Bond Dewax solution (Leica Biosystems, AR9222) and epitope-retrieval using Enzyme 1 solution (Leica Biosystems, AR9551) for 10 minutes at 37 °C. Sections were washed with Leica Bond™ wash buffer (Leica Biosystems, AR9590) before endogenous peroxidase was blocked using a BOND Intense R Detection kit (Leica Biosystems, DS9263). Sections were rinsed with wash buffer before the application of the blocking solution from the goat anti-rat ImmPRESS kit (Vector Labs, MP-7404) for 20 minutes. Sections were subsequently rinsed three times with wash buffer and F4/80 antibody was applied at 1:200 dilution for 30 minutes at room temperature. The sections were again rinsed three times with wash buffer before the application of a goat anti-rat ImmPRESS secondary antibody (Vector Labs, MP-7404). Sections were washed with wash buffer and staining was visualised using DAB and counterstained with hematoxylin from the BOND Intense R Detection kit.
ISH detection for Ccr1 (402728), Ccr2 (501688), Ccr3 (576351), Ccr5 (438658), Mm-Ppib (positive control probe; 313918) and dapB (negative control probe; 312038) mRNA (all from Bio-Techne) was performed using RNAScope 2.5 LSx Reagent Kit-BROWN (322700; Bio-Techne) according to the manufacturer's instructions.
H&E staining was performed on a Leica autostainer (ST5020). Sections were dewaxed in xylene, taken through graded ethanol solutions, and stained with Haematoxylin ‘Z’ stain (CellPath, RBA-4201-00A) for 13 minutes. Sections were washed in water, differentiated in 1% acid alcohol, washed and the nuclei were stained blue in Scott’s tap water substitute (in-house). After washing with tap water, sections were placed in Putt’s eosin (in-house) for 3 minutes.
The stained sections were mounted with a coverslip in xylene using DPX mountant (CellPath, SEA-1300-00A). A minimum of three biological replicates were analyzed and representative images are displayed in the figures. The researchers were blinded to the mouse genotypes and analyzed random fields of tissue sections.
Multiplex immunofluorescence
The OPAL protocol and fluorophore reagents (Akoya Biosciences) were used to perform multiplex immunofluorescence staining on 4 μm thick FFPE sections. The staining was conducted on a Leica BOND RXm autostainer (Leica Microsystems). A total of six consecutive staining cycles was performed using primary antibody–Opal fluorophore pairs.
The macrophage-staining panel comprised the following antibodies: (1) F4/80 (1:100, Abcam, ab111101)–Opal 620; (2) CD4 (1:500, Abcam, ab183685)–Opal 690; (3) COX2 (1:1000, Cayman Chemical, 160126)–Opal 520; (4) VEGFA (1:200, Abcam, ab52917)–Opal 480; (5) SPP1/osteopontin (1:750, Abcam, ab218237)–Opal 570; and (6) E-cadherin (1:500, Cell Signaling, 3195)–Opal 780.
The tissue sections were incubated with primary antibody for an hour at room temperature, with the BOND Polymer Refine Detection System (Leica Biosystems, DS9800) used to detect antibody binding. Following the manufacturers’ protocol, Opal fluorophores were used instead of DAB, and the haematoxylin step was omitted. Antigen retrieval was performed using the BOND Epitope Retrieval Solution 1 or 2 (Leica Biosystems, AR9961) at 100 °C for 20 minutes, in accordance with the standard Leica protocol, before each primary antibody was applied. The sections were subsequently incubated with spectral DAPI (Akoya Biosciences, FP1490) for 10 minutes and mounted with VECTASHIELD Vibrance Antifade Mounting Medium (Vector Laboratories, H-1700-10) onto glass slides. The Akoya Biosciences Vectra Polaris slide scanner was used to obtain whole-slide scans and multispectral images. Batch analysis of the multispectral images from each case was performed using inForm 2.4.8 software, and the resulting batch-analyzed images were combined using the HALO image analysis platform (Indica Labs) to produce a spectrally unmixed reconstructed whole-tissue image. Cell-density analysis was subsequently conducted for each cell phenotype across the tissues of interest using HALO.
Metastasis scoring
H&E sections of lymph nodes, liver, lung, kidney, spleen, pancreas, diaphragm, and peritoneal adipose tissues were examined microscopically for evidence of metastasis, with the researcher blinded to the genotype.
Flow Cytometry and FACS
Tumor samples were dissected and digested using the Mouse Tumor Dissociation Kit (Miltenyi Biotec, 130-096-730) and the gentleMACS Octo Dissociator with Heaters (Miltenyi Biotec, 130-096-427), using the 37C_m_TDK_1 gentleMACS program. Digestion was stopped by the addition of 10 ml RPMI 1640 medium (Gibco/Thermo Fisher Scientific, 31870-025) supplemented with 10% FBS and 2 mM EDTA. Next, the suspension was dispersed through a 70 μm cell strainer to produce single-cell suspensions. Cells from 10 mg tumor were pelleted by centrifugation at 900 ×g for 2 minutes and subsequently stained with 100 µl of LIVE/DEAD Fixable Near-IR Dead Cell Stain Kit (Thermo Fisher Scientific, L10119) at 1:1000 dilution in PBS and incubated in the dark for 20 minutes at 4 °C, and then washed with 1% BSA in PBS. Cells were resuspended in 50 μl of Fc Block (1:200 in 1% BSA in PBS; BD Pharmingen, 553141) and incubated in the dark for 10 minutes on ice. Then, 50 μl of the antibody-staining mixes (at 2× concentration) were added (see Table S1 for details of antibody-staining mixes used for myeloid and lymphoid cell characterisation). The cells were incubated in the dark for 30 minutes at 4 °C, then washed with PBS and resuspended in 50 µl PBS. To fix the cells, 50 µl of 4% paraformaldehyde in PBS was added on top of the cells without mixing and incubated in the dark for 10 minutes at room temperature. The cells were washed and resuspended in 400 µl PBS for the acquisition of data using the BD LSRFortessa Cell Analyzer (BD Biosciences). Immune cell populations were identified using the FACSDiva software, with the following gating strategy: cells in a FSC-A and SSC-A plot; doublet discrimination by discrepancy between FSC-A and FSC-H signals; Live/Dead negative staining for live cells; CD45+ for all immune cells. The T cells were identified as CD3+, then CD8+CD4- population was identified to mark the CD8+ T-cell population. Neutrophils were identified as CD3-CD19-CD11b+CD48-lowLy6G+. 
Macrophages were identified as CD3-CD19-CD11b+Ly6G-F4/80+CD64+. Dendritic cells were identified as CD3-CD19-Ly6G-MHCII+CD11c+F4/80-, then examined for CD11b and CD103 expression.
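The marker combinations in the gating strategy above can be read as a sequence of boolean decisions. The sketch below is an illustrative simplification, not the FACSDiva workflow itself: it assumes each event has already been thresholded into positive or negative calls per marker, omits the scatter and doublet gates, and treats CD48-low as CD48-negative. The function name and dictionary keys are hypothetical.

```python
def classify_event(pos: dict) -> str:
    """Assign a population label from pre-thresholded marker calls.

    `pos` maps a marker name to True (positive) or False (negative).
    Missing markers are treated as negative. Illustrative sketch only.
    """
    def p(marker: str) -> bool:
        return bool(pos.get(marker, False))

    # Live immune cells: Live/Dead-negative and CD45+
    if p("LiveDead") or not p("CD45"):
        return "not live immune"
    # T cells: CD3+, then CD8+CD4- marks the CD8+ T-cell population
    if p("CD3"):
        return "CD8 T cell" if p("CD8") and not p("CD4") else "other T cell"
    # All myeloid gates below require CD3-CD19-
    if p("CD19"):
        return "other"
    # Neutrophils: CD11b+ CD48-low Ly6G+ (CD48-low simplified to negative)
    if p("CD11b") and p("Ly6G") and not p("CD48"):
        return "neutrophil"
    # Macrophages: CD11b+ Ly6G- F4/80+ CD64+
    if p("CD11b") and not p("Ly6G") and p("F4/80") and p("CD64"):
        return "macrophage"
    # Dendritic cells: Ly6G- MHCII+ CD11c+ F4/80-
    if not p("Ly6G") and p("MHCII") and p("CD11c") and not p("F4/80"):
        return "dendritic cell"
    return "other"


event = {"CD45": True, "CD11b": True, "F4/80": True, "CD64": True}
print(classify_event(event))
```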
For staining tumors from iCCR-reporter mice, single-cell suspensions were stained for 15 minutes at 4 °C with 100 μl of Fixable Viability Dye (eBioscience), and Fc receptors were blocked using FcR Blocking Reagent (Miltenyi Biotec). Cells were washed in FACS buffer (PBS with 2 mM EDTA and 2% FBS) and stained for 20 minutes at 4 °C with 100 μl of antibody cocktail containing subset-specific antibodies (please see Table S1 for details of antibodies used). Cells were analyzed fresh as paraformaldehyde fixation can affect reporter protein fluorescence. Photomultiplier tube voltages and compensation for cell surface markers were determined using UltraComp eBeads (eBioscience) as single-color compensation controls. HEK293T cells transfected with a single fluorescent protein were used as single-color compensation controls to adjust voltages and compensations for each fluorescent reporter.
Stained samples were analyzed on a BD LSRFortessa flow cytometer (BD Biosciences) and data analysis was performed using FlowJo software v10.4.2 (FlowJo).
For sorting macrophages, tumors weighing at least 40 mg were digested to generate single-cell suspensions, which were subsequently pelleted, resuspended in 150 μl of Fc Block (BD Pharmingen, 553141) diluted 1:200 in 1% BSA in PBS, and incubated in the dark for 10 minutes on ice. Following the addition of antibody-staining mixes (150 μl of 2× concentrated), the samples were incubated in the dark for 30 minutes on ice (see Table S1 for details of the antibody staining mix). Cells were then washed with 1% BSA in PBS, pelleted, resuspended and passed through a 70 µm filter. DAPI was added to the cell suspension just before sorting. Macrophages were sorted using a BD FACSAria™ cell sorter and gated as DAPI-CD45+CD3-CD19-Ter119-CD11b+CD48+Ly6G-CD64+F4/80+. Sorted cells were pelleted and stored in 350 µl RLT buffer from the RNeasy Micro Kit (Qiagen, 74004) at -80 °C until RNA extraction.
DNA extraction
DNA was extracted from organoids generated from PN tumors and matched tail samples using DNeasy Blood & Tissue Kit (Qiagen, 69504) as per the manufacturer’s instructions. Samples were prepared for exome sequencing analysis, using the Twist NGS Workflow with library preparation conducted using the Library Preparation EF 2.0 protocol followed by multiplex sample preparation (8-plex) with Twist Target Enrichment hybridisation protocol (both Twist Bioscience, 104206 and 101279, respectively). Exome capture was achieved using the Twist Mouse Exome Panel giving a coverage of 37.7 Mb (genome build mm10) (Twist Bioscience). The resultant multiplexed libraries were then sequenced on an Illumina NextSeq 500 desktop sequencer at paired-end read length of 74 bp.
RNA extraction and sample processing for sequencing
RNA from sorted cell populations was isolated using the RNeasy Micro Kit (Qiagen, 74004) according to the manufacturer’s protocol. RNA from cultured organoids or from tumors was isolated using the RNeasy mini kit (Qiagen, 74104) as per the manufacturer’s instructions. The quality of the purified RNA was tested on an Agilent 2200 TapeStation using RNA ScreenTape, and samples with RIN values >7 were used. Libraries for cluster generation and DNA sequencing were prepared following an adapted method from the Illumina TruSeq RNA LT Kit. The quality and quantity of the DNA libraries were assessed on an Agilent 2200 TapeStation (using Agilent D1000 ScreenTape) and a Qubit fluorometer (Thermo Fisher Scientific), respectively. The libraries were run on the Illumina NextSeq 500 using the High Output 75 cycles kit (2×36 cycles, paired-end reads, single index).
For sorted cells, RNA quality and quantity were checked on an Agilent Bioanalyzer 2100 using an RNA 6000 Pico chip. Libraries for cluster generation and DNA sequencing were prepared following the TaKaRa SMARTer Stranded Total RNA-Seq Kit–Pico Input Mammalian v2 protocol.
Single-cell RNA sequencing
Single-cell suspensions were stained with DAPI (Invitrogen, D1306) and live cells were sorted by FACS. Live cells were processed through the 10x Genomics Chromium controller using the Single Cell Gene Expression kit (10x Genomics, Chromium Next GEM Single Cell 3’ Kit v3.1) to generate emulsions which were first reverse-transcribed and then PCR-amplified to generate cDNA. Sequencing libraries were then generated using 10 µl of cDNA as outlined in the 10x Genomics protocol (CG000315 Rev C). Briefly, cDNA was first fragmented, end-repaired and adaptors ligated, followed by PCR amplification and size selection to generate final libraries. The libraries were analyzed using the Bioanalyzer High Sensitivity DNA Kit (Agilent Technologies). scRNA-Seq libraries were sequenced on the NovaSeq S4 flow cell (Illumina) with paired-end 150-base reads to a depth of 25,000 reads per cell.
RNA sequencing
RNA sequencing was performed using an Illumina TruSeq RNA sample prep kit, then run on an Illumina NextSeq using the High Output 75 cycles kit (2 × 36 cycles, paired-end reads, single index). Illumina data were demultiplexed using bcl2fastq v2.19.0, and adaptor sequences were removed using Cutadapt v0.6.4. Raw sequence quality was assessed using FastQC v0.11.8. Sequences were trimmed to remove adaptor sequences and low-quality base calls, defined by a Phred score of <20, using the Trim Galore tool (v0.6.4). The trimmed sequences were aligned to the mouse genome build GRCm38.98 using HISAT2 (v2.1.0). Raw counts per gene were then determined using FeatureCounts (v1.6.4). Differential expression analysis, principal component analysis, gene set enrichment analysis and cell population estimation were performed using MouSR (v1.0, available at https://mousr.qub.ac.uk/ 37). Mouse CMS classes were assigned using the MmCMS package (v0.1.0, available at https://github.com/MolecularPathologyLab/MmCMS 23).
Single-cell RNA sequencing data curation
Sequence alignment of single-cell data to the mm10 genome was performed using the count tool from the Cell Ranger package (v6.1.2) according to the developers’ instructions, generating barcodes, features and matrix output files for each sample. Subsequent analysis was performed using R (v4.1.1) and Seurat (v4.0.4). Samples were input using the Read10X function, filtering to include cells with a minimum of 100 expressed genes and genes that are present in at least three cells, then further filtered to only include cells with <5% mitochondrial genes, <10% hemoglobin genes, >100 genes/cell, and >400 reads/cell. Samples were then integrated by RPCA using the IntegrateData function before being scaled and normalised. Dimension reduction was then performed using principal component analysis (PCA) before clustering using the FindNeighbors and FindClusters functions. Marker genes for individual clusters were determined using the FindAllMarkers function. Cell types were annotated using CellTypist and custom gene lists.
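The per-cell quality thresholds listed above amount to a simple conjunction of four conditions. The following sketch expresses them in plain Python for clarity; the analysis itself was run in R with Seurat, and the field names used here are hypothetical.

```python
def passes_qc(cell: dict) -> bool:
    """Apply the per-cell thresholds stated in the text.

    Expected keys (hypothetical names): pct_mito, pct_hb, n_genes, n_reads.
    """
    return (
        cell["pct_mito"] < 5.0      # <5% mitochondrial genes
        and cell["pct_hb"] < 10.0   # <10% hemoglobin genes
        and cell["n_genes"] > 100   # >100 genes/cell
        and cell["n_reads"] > 400   # >400 reads/cell
    )


# Made-up cells for illustration
cells = [
    {"id": "c1", "pct_mito": 2.1, "pct_hb": 0.0, "n_genes": 850, "n_reads": 2300},
    {"id": "c2", "pct_mito": 7.5, "pct_hb": 0.2, "n_genes": 900, "n_reads": 2500},
    {"id": "c3", "pct_mito": 1.0, "pct_hb": 0.0, "n_genes": 90, "n_reads": 350},
]
kept = [c["id"] for c in cells if passes_qc(c)]
print(kept)
```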
Exome sequencing
After sequencing samples on NextSeq, quality metrics were assessed using FastQC (v0.11.8), bedtools (v2.21.0) and Exome CQA. Sequence alignment to mouse genome version mm10 was performed using BWA-MEM. Aligned BAM files were sorted and indexed with samtools (v1.9) before duplicate reads were marked using the MarkDuplicates function from Picard tools (v2.21.1). Next, base recalibration was performed using the BaseRecalibrator and ApplyBQSR functions from the GATK4 suite (v4.1.8.1), using annotated SNP and INDEL sites from C57BL/6NJ mice obtained from the Mouse Genome Project (Wellcome Sanger Institute) as known sites. Variant calling was then performed using the Mutect2 function from the GATK4 suite, using the tumor-only workflow and a panel of normals created from C57BL/6NJ variant calls obtained from the Mouse Genome Project as above. Variants were then filtered using the FilterMutectCalls function from GATK4 and annotated using snpEff with GRCm38.99 annotations.
Microsatellite instability (MSI) was assessed with MSIsensor-pro58 (v1.0.a), using the scan function to obtain microsatellite information, the baseline function to calibrate the microsatellite sites with matched normal samples, and the pro function to evaluate microsatellite instability.
PN and PNCCR signatures
PN and PNCCR signatures were generated from sorted macrophages by quantifying differential gene expression data from PNCCR vs PN tumors using the DESeq2 R package. The PN signature was formed of genes downregulated in PNCCR macrophages with base mean > 100, log2 fold change < -1.5 and adjusted p-value < 0.05, and known to be expressed by macrophages based on the tissue expression database (https://tissues.jensenlab.org/). The PNCCR signature was formed of genes upregulated in PNCCR macrophages with log2 fold change > 1.5, base mean > 300 and adjusted p-value < 0.05. Genes forming each signature are listed in Figures 8B and 8C.
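The selection criteria for the two signatures can be written as filters over a table of differential expression results. The sketch below is a plain-Python illustration rather than the DESeq2 workflow used in the study: it assumes each row carries the DESeq2 result columns baseMean, log2FoldChange and padj for the PNCCR vs PN contrast, and that the set of macrophage-expressed genes has been prepared separately. Function names and the placeholder gene names are hypothetical.

```python
def pn_signature(results: list, macrophage_genes: set) -> list:
    """Genes downregulated in PNCCR macrophages and expressed by macrophages."""
    return [
        r["gene"] for r in results
        if r["padj"] is not None          # adjusted p-value may be missing
        and r["baseMean"] > 100
        and r["log2FoldChange"] < -1.5
        and r["padj"] < 0.05
        and r["gene"] in macrophage_genes
    ]


def pnccr_signature(results: list) -> list:
    """Genes upregulated in PNCCR macrophages."""
    return [
        r["gene"] for r in results
        if r["padj"] is not None
        and r["baseMean"] > 300
        and r["log2FoldChange"] > 1.5
        and r["padj"] < 0.05
    ]


# Placeholder rows; gene names and values are invented for illustration
rows = [
    {"gene": "GeneA", "baseMean": 150.0, "log2FoldChange": -2.0, "padj": 0.01},
    {"gene": "GeneB", "baseMean": 500.0, "log2FoldChange": 2.0, "padj": 0.001},
    {"gene": "GeneC", "baseMean": 200.0, "log2FoldChange": 2.0, "padj": 0.001},
]
print(pn_signature(rows, {"GeneA"}), pnccr_signature(rows))
```

Note that the two signatures use different base mean cut-offs (100 and 300), so the filters are not mirror images of each other.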
Differential gene expression across normal, tumor, and metastatic CRC-patient tissue
Bulk microarray-derived gene expression data from an existing TNM dataset59 were obtained for normal colon tissue (n=377), primary colorectal tumor tissue (n=1450), and metastatic CRC lesions (n=99) from human patients/subjects, and were processed via the dataset's default computational workflow, which uses the ggplot2 R package accessed via a Shiny R package (v1.7.4) utilizing the shinythemes (v1.2.0) and shinycssloaders (v1.0.0) R packages. The TNM dataset consisted of integrated microarray data derived from standard databases such as the Gene Expression Omnibus of the National Center for Biotechnology Information (NCBI-GEO), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and The Genotype-Tissue Expression (GTEx) repositories. All settings were pre-defined and accessed via default workflows with no further modifications or exclusions of data60. Here, the aforementioned PN and PNCCR macrophage signatures were each assembled as a metagene (average expression of all genes per signature) and plotted as a box plot to compare expression levels of this metagene across normal colon tissue, primary colorectal tumor tissue and metastatic lesions.
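The metagene described above is the average expression of all genes in a signature within one sample. A minimal sketch of that calculation is given below; the function name, the handling of genes missing from the expression table, and the example values are assumptions for illustration and do not reproduce the workflow of the TNM tool.

```python
from statistics import mean


def metagene_score(expression: dict, signature: list) -> float:
    """Average expression of the signature genes in one sample.

    ASSUMPTION: genes absent from `expression` are skipped rather than
    counted as zero; the source does not state how such genes were handled.
    """
    values = [expression[gene] for gene in signature if gene in expression]
    if not values:
        raise ValueError("no signature genes found in this sample")
    return mean(values)


# Placeholder sample and signature; names and values are invented
sample = {"GeneA": 2.0, "GeneB": 4.0, "GeneC": 9.0}
print(metagene_score(sample, ["GeneA", "GeneB"]))
```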
Enrichment of the PN-MAC signature in patient cohorts
Gene symbols and Entrez IDs were matched using the org.Hs.eg.db R package (v3.16.0). Thereafter, CMS classification was performed using the random forest (RF) method in the CMSclassifier R package (v1.0.0). To determine the degree of enrichment of the PN-MAC and CMS signatures in individual patient samples, single-sample GSEA (ssGSEA) was performed using GSVA (v1.44.5). Correlation analyses were performed using the ‘stat_cor’ function from the ggpubr package (v0.6.0) with the ‘Pearson’ method. The default setting was used for all other parameters. All statistical analyses were performed using R (v4.2.1) and RStudio (v2022.7.2.576).
Human single-cell data analyses
The processed count expression matrix and the corresponding cell-type annotation file were downloaded from the Gene Expression Omnibus (GSE132465 for all Samsung Medical Center tumor (SMC-T) samples). All tumor cells (n=47285) were normalized using the NormalizeData function in Seurat R package (v4.3.0) with normalization.method=LogNormalize.
For this analysis, the tumor myeloid populations (n=6400 cells) were extracted and cDC and unknown clusters were removed from the population. To ensure that the remaining population (n=5586 cells) only comprised macrophages, we applied another layer of confidence by filtering in only cells that expressed both CD68 and CSF1R (n=1964 cells).
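The two-step filter described above (removal of the cDC and unknown clusters, then retention of cells expressing both CD68 and CSF1R) can be sketched as follows. This is an illustrative plain-Python version, not the Seurat code used for the analysis; it assumes that "expressed" means a count above zero, and the cluster labels and record layout are hypothetical.

```python
# Hypothetical cluster labels standing in for the removed clusters
EXCLUDED_CLUSTERS = {"cDC", "Unknown"}


def keep_macrophage(cell: dict) -> bool:
    """Return True for cells retained as high-confidence macrophages."""
    if cell["cluster"] in EXCLUDED_CLUSTERS:
        return False
    counts = cell["counts"]
    # ASSUMPTION: a gene is "expressed" if its count is greater than zero
    return counts.get("CD68", 0) > 0 and counts.get("CSF1R", 0) > 0


# Made-up cells for illustration
cells = [
    {"id": "m1", "cluster": "SPP1+", "counts": {"CD68": 5, "CSF1R": 2}},
    {"id": "m2", "cluster": "SPP1+", "counts": {"CD68": 3}},
    {"id": "d1", "cluster": "cDC", "counts": {"CD68": 4, "CSF1R": 1}},
]
print([c["id"] for c in cells if keep_macrophage(c)])
```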
For clustering, the top 2000 variably expressed genes were selected using the FindVariableFeatures function in the Seurat R package with selection.method = "vst". Principal component analysis was performed using the variably expressed genes, and fifteen principal components were selected from the knee of the scree plot. Following this initial dimensionality reduction by PCA, uniform manifold approximation and projection (UMAP), a non-linear dimensionality reduction technique, was applied using the RunUMAP function in the Seurat package to visualize the scRNA-seq data in a two-dimensional space.
The FindNeighbors function was used to compute the nearest neighbors and FindClusters was used to identify clusters within the macrophage population.
Single cells were annotated by CMS calls according to the CMS bulk subtyping provided for the corresponding patients.
QUANTIFICATION AND STATISTICAL ANALYSIS
Statistical analyses were performed using GraphPad Prism software (version 9.5.0, GraphPad Software, Inc.) and R (version 4.2.0), with tests indicated in the figure legends.
In summary, Kaplan–Meier survival curves were used to analyze survival post tamoxifen-induced Cre-mediated recombination or drug treatments, and log-rank Mantel–Cox tests were used to compare survival between cohorts of different genotypes or between vehicle- and drug-treated counterparts. The Shapiro–Wilk normality test was used to ascertain a normal distribution of the data. The two-tailed unpaired t-test was then used to compare normally distributed data between two cohorts or treatments. For data that were not normally distributed (i.e., that failed the Shapiro–Wilk normality test), the Mann–Whitney test was used. When comparing more than two cohorts or treatments that were normally distributed, a one-way ANOVA test was performed followed by Fisher’s LSD multiple comparisons. When comparing more than two cohorts or treatments that were not normally distributed, the Kruskal–Wallis test was performed followed by Dunn’s test for multiple comparisons. To determine whether there was a significant association between genotype or treatment and incidence of metastasis, Fisher’s exact test was used. Tests were considered statistically significant at p<0.05. Exact p values are indicated either on the figure panels or in the figure legends.
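The choice of test for comparing continuous data between cohorts follows a small decision rule based on the number of groups and the outcome of the Shapiro–Wilk test. The sketch below encodes only that rule; it does not run any statistical test, and the function name and return strings are hypothetical.

```python
def choose_test(n_groups: int, all_normal: bool) -> str:
    """Name the comparison described in the text for a given design.

    n_groups:   number of cohorts or treatments being compared
    all_normal: True if the data passed the Shapiro-Wilk normality test
    """
    if n_groups < 2:
        raise ValueError("at least two groups are required for a comparison")
    if n_groups == 2:
        if all_normal:
            return "two-tailed unpaired t-test"
        return "Mann-Whitney test"
    if all_normal:
        return "one-way ANOVA with Fisher's LSD multiple comparisons"
    return "Kruskal-Wallis test with Dunn's multiple comparisons"


for groups, normal in [(2, True), (2, False), (3, True), (4, False)]:
    print(groups, normal, "->", choose_test(groups, normal))
```

Survival comparisons (log-rank Mantel–Cox) and metastasis incidence (Fisher’s exact test) fall outside this rule and are therefore not covered by the sketch.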