Patient characteristics
Of the six patients with primary breast cancer, three had clinically triple-negative tumors (Case IDs NEO1, NEO2, and NEO6; Table 1), a subtype lacking expression of estrogen receptor, progesterone receptor, and human epidermal growth factor receptor-2 (HER2), and three had hormone receptor-positive breast cancer of the luminal B subtype with a high Ki67 labeling index (Case IDs NEO3, NEO4, and NEO5; Table 1). Two of the hormone receptor-positive tumors showed invasive lobular carcinoma histology. At the time of sample collection, no patient was undergoing therapy except NEO2. Patient NEO2 had received neoadjuvant chemotherapy (AC [doxorubicin/cyclophosphamide]) followed by a weekly paclitaxel regimen, but discontinued neoadjuvant chemotherapy after one paclitaxel infusion because of early disease progression and then underwent surgery.
Three PDXs established from surgical specimens were used to represent lung cancer: two from squamous cell carcinoma of the lung (AV10 and AV38) and one from non-small-cell lung cancer (AV13) (Table 1).
Table 1
Case ID | Age | Histologic Type | Histologic Grade | Ki67 (%) | Sample source | Subtype | Stage | HLA type |
NEO1 | 60 | IDC | 3 | 77.2 | Surgery | TNBC | Stage IIA (pT1N1M0) | A*24:02, A*33:03 |
NEO2 | 47 | IDC | 3 | 66.1 | Surgery | TNBC | ypStage IA (ypT1N0M0) | A*24:02, A*33:03 |
NEO3 | 74 | IDC | 2 | 63.4 | Surgery | Luminal B BC | Stage IIA (pT2N0M0) | A*02:06, A*11:01 |
NEO4 | 52 | ILC | 2 | 55.6 | Surgery | Luminal B BC | Stage IIA (pT2N0M0) | A*02:06, A*11:01 |
NEO5 | 52 | ILC | 2 | 47.2 | Surgery | Luminal B BC | Stage IIA (pT2N0M0) | A*02:01, A*02:01 |
NEO6 | 74 | IDC | 3 | 82.3 | Surgery | TNBC | Stage IIA (pT2N0M0) | A*02:01, A*24:02 |
AV10 | NA | Squamous cell carcinoma | NA | NA | PDX | SQCC | stage IIB (T2bN1M0) | A*32:01, A*33:03 |
AV13 | NA | Non-small cell | NA | NA | PDX | NSCLC | stage IIIA (T2bN2 M) | A*02:01, A*30:04 |
AV38 | NA | Squamous cell carcinoma | NA | NA | PDX | SQCC | stage 1b | A*01:01, A*31:01 |
IDC, invasive ductal carcinoma; ILC, invasive lobular carcinoma; TNBC, triple-negative breast cancer; BC, breast cancer; SQCC, squamous cell carcinoma; NSCLC, non-small cell carcinoma of the lung; PDX, patient-derived tumor xenograft.
Comparison of HLA genotyping methods
We used Sanger sequencing-based clinical HLA genotyping results with two-digit resolution (BIOWITHUS Inc., Seoul, Korea) as the reference. To evaluate NGS-based HLA typing methods, we additionally applied OptiType 29 and HLAminer 30 and used the Omixon Holotype HLA™ kit (Omixon Biocomputing Ltd, Budapest, Hungary) (Table 2).
Table 2
HLA genotyping results comparison among three different methods.
Case ID | HLA-A genotype (clinical) | OptiType | HLAminer | Omixon Holotype |
NEO1 | A*24:02, A*33:03 | A*24:02, A*33:03 | A*24:02, A*33:03 | A*24:02, A*33:03 |
NEO2 | A*24:02, A*33:03 | A*24:02, A*33:03 | A*24:02, A*33:03 | A*24:02, A*33:03 |
NEO3 | A*02:06, A*11:01 | A*02:06, A*11:01 | A*02:06, A*11:01 | A*02:06, A*11:01 |
NEO4 | A*02:06, A*11:01 | A*02:06, A*11:01 | A*02:06, A*11:01 | A*02:06, A*11:01 |
NEO5 | A*02:01, A*02:01 | A*02:01, A*02:01 | A*02:01, A*02:01 | A*02:01, A*02:01 |
NEO6 | A*02:01, A*24:02 | A*02:01, A*24:02 | A*24:95, A*69:01 | A*02:01, A*24:02 |
AV10 | A*32:01, A*33:03 | A*33:03, A*33:03 | A*33:147, NA | ND |
AV13 | A*02:01, A*30:04 | A*02:01, A*30:04 | A*30:04, A*69:01 | ND |
AV38 | A*01:01, A*31:01 | A*01:01, A*31:01 | A*01:01, A*36:03 | ND |
Average accuracy | reference | 17/18 (94%) | 11/18 (61%) | 12/12 (100%) |
Note: bold words represent mismatches |
Concordance between NGS-based OptiType and clinical HLA genotyping was high, with 94% agreement (Supplementary Table 1). In contrast, the concordance rate for HLAminer was low (61%). Omixon Holotype assay results were 100% concordant with clinical genotypes in the samples for which they were available.
These results suggest that HLA typing with the OptiType algorithm on exome or RNA-seq data is a good alternative to clinical HLA typing tests for candidate neoepitope selection, without the need for costly dedicated HLA genotyping kits.
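As a minimal sketch of how the allele-level concordance in Table 2 can be tallied, the snippet below compares each caller's HLA-A genotype with the clinical reference, treating a genotype as a multiset of two alleles. The dictionaries are transcribed from Table 2; the function names and the multiset-matching rule are illustrative assumptions, not the authors' stated procedure. Only OptiType and Omixon Holotype are shown, and cases without a call (ND) are simply left out of the comparison.

```python
from collections import Counter

# Clinical (reference) HLA-A genotypes, transcribed from Table 2.
clinical = {
    "NEO1": ["A*24:02", "A*33:03"], "NEO2": ["A*24:02", "A*33:03"],
    "NEO3": ["A*02:06", "A*11:01"], "NEO4": ["A*02:06", "A*11:01"],
    "NEO5": ["A*02:01", "A*02:01"], "NEO6": ["A*02:01", "A*24:02"],
    "AV10": ["A*32:01", "A*33:03"], "AV13": ["A*02:01", "A*30:04"],
    "AV38": ["A*01:01", "A*31:01"],
}

# OptiType calls from Table 2: identical to clinical except AV10.
optitype = dict(clinical, AV10=["A*33:03", "A*33:03"])

# Omixon Holotype calls: available for the six breast cancer cases only.
omixon = {case: clinical[case] for case in clinical if case.startswith("NEO")}


def matched_alleles(reference, called):
    """Count alleles shared by two genotypes, treating each as a multiset."""
    overlap = Counter(reference) & Counter(called)
    return sum(overlap.values())


def concordance(reference_calls, test_calls):
    """Return (matched alleles, total reference alleles) over the called cases."""
    matched = sum(
        matched_alleles(reference_calls[case], test_calls[case])
        for case in test_calls
    )
    total = sum(len(reference_calls[case]) for case in test_calls)
    return matched, total


for name, calls in [("OptiType", optitype), ("Omixon Holotype", omixon)]:
    matched, total = concordance(clinical, calls)
    print(f"{name}: {matched}/{total} ({matched / total:.0%})")
```

Under these assumptions the tallies are 17/18 for OptiType and 12/12 for Omixon Holotype, matching the last row of Table 2.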
Candidate neoepitope selection
Paired whole-exome sequencing of tumor and PBMC DNA yielded an average of 95 (range 70 to 107) nonsynonymous somatic mutations in breast cancer tumors and an average of 109 (range 76 to 143) nonsynonymous mutations in lung cancer PDX samples.
In this study, we focused only on HLA-A allele-restricted neoepitopes for experimental validation. For each tumor sample, all candidate epitopes predicted by Neopepsee and pVACseq are listed in Supplementary Table 1, and the ELISpot results are summarized in Table 3. Neopepsee considers only SNVs and does not consider fusion genes or intron retention. Neopepsee selected an average of 19 candidates (range 7 to 33) from breast cancer mutations and 17 candidates (range 11 to 21) from lung cancer mutations. pVACseq analysis with modified thresholds (IC50 below 100 nM, DNA VAF >20, and RNA VAF >0) yielded an average of 9 best candidates for breast cancer (range 3 to 26) and 15 for lung cancer PDX samples (range 8 to 21).
In total, Neopepsee predicted 159 neoepitope candidates from 898 mutations (17.7%), and pVACseq predicted 84 (9.4%), with only 26 (2.9%) shared between the two prediction algorithms. Thus, the ensemble of Neopepsee and pVACseq identified 217 candidate neoepitopes (24.2%) from 898 SNVs.
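The ensemble size follows from inclusion-exclusion on the two prediction sets. The short sketch below reproduces the counts and percentages quoted above; the variable names and the `pct` helper are illustrative only.

```python
# Counts reported in the text for all nine cases combined.
n_mutations = 898   # nonsynonymous somatic mutations (SNVs)
neopepsee = 159     # candidates predicted by Neopepsee
pvacseq = 84        # candidates predicted by pVACseq (modified thresholds)
shared = 26         # candidates predicted by both algorithms

# Inclusion-exclusion: size of the union of the two candidate sets.
ensemble = neopepsee + pvacseq - shared


def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(ensemble)                     # 217
print(pct(neopepsee, n_mutations))  # 17.7
print(pct(pvacseq, n_mutations))    # 9.4
print(pct(shared, n_mutations))     # 2.9
print(pct(ensemble, n_mutations))   # 24.2
```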
In vitro validation of candidate neoepitopes
Based on a report by Stronen et al,25,26 we used HLA-matched donor blood for an ELISpot assay of IFN-γ secretion from neoantigen-specific T cell populations to validate candidate neoepitopes. Among 217 candidate neoepitopes, 22 Neopepsee candidate peptides and 9 pVACseq candidate peptides could not be tested, either because of synthesis or purification failure or because donor blood for the specific HLA alleles was lacking. In total, 36 of 191 tested candidates (18.8%) were positive by ELISpot. The results summarized by case and prediction algorithm are provided in Table 3, with detailed information for the immunogenic peptides provided in Table 4. For some candidate peptides, although no memory response could be demonstrated, there were higher numbers of spots than in the wells not stimulated with the peptide from day 0. We did not consider those peptides to be positive, although some reports in the literature considered such results to be immunogenic. For some peptides, there were strong responses in only one of the three donor blood samples, suggesting that the response could be directed toward an allele other than the predicted allele.
For our primary aim, we tested whether we could identify immunogenic neoepitopes from ER+ BC, with ER- BC and lung cancers serving as positive controls.
The ensemble of Neopepsee and pVACseq predicted 93 neoepitopes from 299 somatic mutations in three ER+ BC patients. Among them, 90 could be tested with ELISpot, and 14 (15.6%) were immunogenic (1, 5, and 8 for each tumor). In three ER- BC patients, 52 neoepitopes were predicted from 271 mutations, and 12 (25.0%) of 48 tested were immunogenic (2, 3, and 7 for each tumor). From three lung cancer PDXs, 53 of 72 predicted neoepitope candidates were tested, and 10 of them were immunogenic (18.9%) (0, 1, and 9 for each tumor). These differences were not statistically significant. Therefore, we conclude that ER+ luminal B BCs express immunogenic neoepitopes, although their numbers vary widely between individual tumors.
Table 3
Summary of ELISpot assays of IFN-γ secretion from neoantigen-specific T cell populations from donor PBMCs.
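Case ID | HLA genotype | N of mutations | Neopepsee: predicted | Neopepsee: tested | Neopepsee: ELISpot positive | Neopepsee: PPV | pVACseq (modified)*: predicted | pVACseq (modified)*: tested | pVACseq (modified)*: ELISpot positive | pVACseq (modified)*: PPV | Common to both: predicted | Common to both: tested | Common to both: ELISpot positive | Common to both: PPV |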
NEO1 | A*24:02, A*33:03 | 70 | 9 | 9 | 3 | 0.33 | 4 | 4 | 1 | 0.25 | 3 | 3 | 1 | 0.33 |
NEO2 | A*24:02, A*33:03 | 94 | 7 | 6 | 2 | 0.33 | 7 | 6 | 0 | 0.00 | 1 | 0 | 0 | |
NEO3 | A*02:06, A*11:01 | 103 | 28 | 27 | 6 | 0.22 | 26 | 25 | 4 | 0.16 | 11 | 10 | 2 | 0.20 |
NEO4 | A*02:06, A*11:01 | 96 | 33 | 31 | 1 | 0.03 | 3 | 3 | 0 | 0.00 | 1 | 1 | 0 | 0.00 |
NEO5 | A*02:01, A*02:01 | 100 | 12 | 12 | 4 | 0.33 | 5 | 5 | 2 | 0.40 | 2 | 2 | 1 | 0.50 |
NEO6 | A*02:01, A*24:02 | 107 | 24 | 22 | 6 | 0.27 | 9 | 7 | 2 | 0.29 | 4 | 3 | 1 | 0.33 |
AV10 | A*32:01, A*33:03 | 143 | 11 | 3 | 0 | 0.00 | 5 | 3 | 0 | 0.00 | 0 | 0 | 0 | |
AV13 | A*02:01, A*30:04# | 109 | 21 | 17 | 6 | 0.35 | 15 | 13 | 5 | 0.38 | 4 | 2 | 2 | 1.00 |
AV38 | A*01:01#, A*31:01 | 76 | 14 | 10 | 0 | 0.00 | 10 | 9 | 1 | 0.11 | 0 | 0 | 0 | |
Total | | 898 | 159 | 137 | 28 | 0.20 | 84 | 75 | 15 | 0.20 | 26 | 21 | 7 | 0.33 |
#Note: A*30:04 and A*01:01 matched peptides were not tested due to lack of donor blood |
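The group-level totals for ER+ BC, ER- BC, and lung cancer PDXs can be re-derived from Table 3 by taking, for each case, the union of the two algorithms (Neopepsee + pVACseq - common to both) and summing over the cases in each group. The sketch below does this with values transcribed from Table 3; the data layout and function names are illustrative assumptions.

```python
# (predicted, tested, ELISpot positive) per case, transcribed from Table 3.
# "N" = Neopepsee, "P" = pVACseq (modified), "both" = common to both.
table3 = {
    "NEO1": {"N": (9, 9, 3),    "P": (4, 4, 1),    "both": (3, 3, 1)},
    "NEO2": {"N": (7, 6, 2),    "P": (7, 6, 0),    "both": (1, 0, 0)},
    "NEO3": {"N": (28, 27, 6),  "P": (26, 25, 4),  "both": (11, 10, 2)},
    "NEO4": {"N": (33, 31, 1),  "P": (3, 3, 0),    "both": (1, 1, 0)},
    "NEO5": {"N": (12, 12, 4),  "P": (5, 5, 2),    "both": (2, 2, 1)},
    "NEO6": {"N": (24, 22, 6),  "P": (9, 7, 2),    "both": (4, 3, 1)},
    "AV10": {"N": (11, 3, 0),   "P": (5, 3, 0),    "both": (0, 0, 0)},
    "AV13": {"N": (21, 17, 6),  "P": (15, 13, 5),  "both": (4, 2, 2)},
    "AV38": {"N": (14, 10, 0),  "P": (10, 9, 1),   "both": (0, 0, 0)},
}

groups = {
    "ER+ BC": ["NEO3", "NEO4", "NEO5"],
    "ER- BC": ["NEO1", "NEO2", "NEO6"],
    "Lung PDX": ["AV10", "AV13", "AV38"],
}


def union_counts(row):
    """Union of both algorithms for one case: N + P - both, per column."""
    return tuple(n + p - b for n, p, b in zip(row["N"], row["P"], row["both"]))


def group_summary(cases):
    """Sum the per-case union counts over a group of cases."""
    per_case = [union_counts(table3[case]) for case in cases]
    return tuple(sum(column) for column in zip(*per_case))


for name, cases in groups.items():
    predicted, tested, positive = group_summary(cases)
    rate = 100 * positive / tested
    print(f"{name}: predicted={predicted}, tested={tested}, "
          f"positive={positive} ({rate:.1f}%)")
```

With these inputs the group totals come out as 93/90/14 for ER+ BC, 52/48/12 for ER- BC, and 72/53/10 for lung PDXs (predicted/tested/positive).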
Table 4
List of ELISpot-positive cancer-specific neoepitopes.
case | HLA Allele | Prediction algorithm | Candidate neoepitope (9 or 10mer) | Wild type sequence | Gene Symbol | NetMHCpan binding affinity mutant | NetMHCpan binding affinity Wild type | BindLevel |
Neo1 | A*24:02 | N | SYGRLMFFC | SHGRLMFFC | CPA6 | 3276.94 | 24435.71 | |
Neo1 | A*24:02 | N | RFIPGSSLL | RFIRGSSLL | ZNF517 | 126.35 | 173.14 | SB |
Neo1 | A*33:03 | NP | IYFLIGTSR | IYFLMGTSR | SLC26A1 | 40.73 | 55.21 | SB |
Neo2 | A*24:02 | N | CYKMIGLTI | CYIMIGLTI | FAM162B | 162.07 | 142.00 | WB |
Neo2 | A*24:02 | N | RYLQLQLHL | RYLQLQLYL | IKZF4 | 31.80 | 29.17 | SB |
Neo3 | A*11:01 | N | IAYNLYLIY | IAYNLSLIY | GTF3C3 | 195.64 | 115.19 | WB |
Neo3 | A*11:01 | N | ASVRKKLGK | ASVRKKLGE | SPTBN4 | 114.48 | 25706.04 | SB |
Neo3 | A*02:06 | N | FLGSHLLHI | FLGSRLLHI | RTL1 | 11.48 | 33.03 | SB |
Neo3 | A*11:01 | N | PTTMPYPLK | PTTMTYPLK | VSIG4 | 495.86 | 589.41 | WB |
Neo3 | A*02:06 | NP | IILRALCAL | IILRAVCAL | SLFN5 | 56.88 | 134.81 | |
Neo3 | A*11:01 | NP | (A)TACWSGLFK | (A)TACWSGLCK | PRPF4 | 22.26 | 166.48 | WB |
Neo3 | A*02:06 | P | VLIKGSINSV | VLIEGSINSV | ARPC4 | 43.69 | 14.37 | SB |
Neo3 | A*11:01 | P | HSNRLAVAYK | HTNRLAVAYK | DMXL1 | 18.12 | 12.66 | WB |
Neo4 | A*02:06 | N | YQDNVTIFA | YQDNVTVFA | ABCA2 | 30.64 | 31.39 | SB |
Neo5 | A*02:01 | N | LLAYSEYNL | LLAYSEYNLP | USP6 | 33.2 | 1872.93 | WB |
Neo5 | A*02:01 | N | AINSYRFLV | AINYYRFLV | GPR68 | 64.63 | 92.90 | WB |
Neo5 | A*02:01 | N | KLQPFFEGM | KLKPFFEGM | PACS1 | 98.06 | 747.38 | SB |
Neo5 | A*02:01 | NP | HLLQCAWLEI | HLLECAWLEI | AK9 | 99.66 | 18.76 | |
Neo5 | A*02:01 | P | YMNAIKDYEL | YINAIKDYEL | PLEC | 8.21 | 106.30 | WB |
Neo6 | A*24:02 | N | YYQLFAATV | YYQLFAAAV | ANKRD53 | 45.53 | 285.69 | SB |
Neo6 | A*24:02 | N | IYETNVVGF | IYETNVLGF | TULP1 | 157.53 | 92.89 | SB |
Neo6 | A*24:02 | N | SFLKLAKLF | SFLKLAELF | SNX32 | 99.82 | 18.28 | SB |
Neo6 | A*02:01 | N | YIQTTTLPV | YIQTTTLTV | GRM6 | 15.88 | 44.45 | WB |
Neo6 | A*24:02 | N | YHIFFDQVF | YHIFFDKVF | PGBD2 | 812.59 | 2762.52 | WB |
Neo6 | A*24:02 | NP | AYLPWSYFL | AYLPWSYFP | C4orf33 | 9.19 | 1014.65 | SB |
Neo6 | A*02:01 | P | KLFSRNSGL | KFFSRNSGL | ZNF304 | 55.09 | 9702.91 | SB |
AV13 | A*02:01 | N | ALRRFAFMV | ALMRFAFMV | LPGAT1 | 171.47 | 3.52 | |
AV13 | A*02:01 | N | LLLFCDVGL | LLLFCDVDL | CHSY3 | 15.27 | 62.59 | WB |
AV13 | A*02:01 | N | AMTILILKV | AMTIWILKV | SGCZ | 60.49 | 42.39 | SB |
AV13 | A*02:01 | N | YLLMISALM | YLIMISALM | PSEN1 | 13.49 | 20.89 | WB |
AV13 | A*02:01 | NP | LLLTCGEKV | LLLTCGEEV | STC2 | 54.01 | 24.09 | WB |
AV13 | A*02:01 | NP | GVANCLFPL | GVGNCLFPL | METTL6 | 21.62 | 91.66 | WB |
AV13 | A*02:01 | P | FVIPEVFLKL | FVSPEVFLK | DNAAF5 | 64.82 | 10909.18 | WB |
AV13 | A*02:01 | P | FLLRGPPVPV | FLLRGPPGPV | C8orf82 | 4.60 | 6.90 | SB |
AV13 | A*02:01 | P | YLQRNAPTL | YLQRNALTL | KDM6A | 21.75 | 35.01 | SB |
AV38 | A*31:01 | P | RQDIDFGVSR | RQDIDLGVSR | NFE2L2 | 49.25 | 115.92 | SB |
Note: mutations and matching wild-type sequences are in bold. N = predicted by Neopepsee, P = predicted by pVACseq, NP = predicted by both Neopepsee and pVACseq.
For our first secondary aim, we compared the positive predictive value (PPV) of Neopepsee and pVACseq. Twenty-eight of 137 tested Neopepsee candidates were ELISpot positive (PPV 0.20) compared to 15 of 75 tested pVACseq candidates (PPV 0.20). These PPVs were not significantly different (p=0.95). Among 21 tested candidates predicted by both algorithms, seven were positive by ELISpot (PPV 0.33). Therefore, we conclude that Neopepsee and pVACseq identify different pools of candidate neoepitopes from sequencing data and are complementary to each other. In combination, they provide a reasonable number of candidates to screen for vaccine design, reducing the number of candidates that need to be screened experimentally from 8082 (898 mutations times 9 possible positions for a mutation within a 9-mer peptide) to 217, with a median of 24 candidates per case (range 10 to 43) and a PPV of 18.8%.
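The figures in this comparison are simple ratios and can be checked directly. The sketch below recomputes the PPVs, the size of the naive 9-mer search space, and the per-case median; the per-case candidate counts are derived from Table 3 as Neopepsee + pVACseq - common to both, and all names are illustrative.

```python
from statistics import median

# ELISpot-positive / tested counts from the "Total" row of Table 3.
ppv_neopepsee = 28 / 137
ppv_pvacseq = 15 / 75
ppv_shared = 7 / 21
ppv_ensemble = 36 / 191  # all tested candidates from either algorithm

# Naive search space: each mutation can sit at any of 9 positions in a 9-mer.
n_mutations = 898
naive_candidates = n_mutations * 9

# Ensemble candidates per case (NEO1..NEO6, AV10, AV13, AV38), from Table 3.
per_case_candidates = [10, 13, 43, 35, 15, 29, 16, 32, 24]

print(f"PPV Neopepsee: {ppv_neopepsee:.2f}")   # 0.20
print(f"PPV pVACseq:   {ppv_pvacseq:.2f}")     # 0.20
print(f"PPV shared:    {ppv_shared:.2f}")      # 0.33
print(f"PPV ensemble:  {ppv_ensemble:.3f}")    # 0.188
print(naive_candidates, sum(per_case_candidates))  # 8082 217
print(median(per_case_candidates),
      min(per_case_candidates), max(per_case_candidates))  # 24 10 43
```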
Evaluation of TESLA recommended criteria for neoepitopes
Previously, the Tumor Neoantigen Selection Alliance (TESLA) suggested that potentially immunogenic peptides are characterized by an MHC binding affinity stronger than 34 nM.14 Following this criterion, we stratified the in vitro tested neopeptides by an MHC binding affinity cutoff of 34 nM (Table 5).
Table 5
Summary of neopeptides tested in vitro by ELISpot assay, stratified by an MHC binding affinity (IC50) cutoff of 34 nM.
| IC50 <34 nM | IC50 ≥34 nM | Marginal row total |
ELISpot positive | 14 (38.9%) | 22 (61.1%) | 36 |
ELISpot negative | 47 (30.3%) | 108 (69.7%) | 155 |
Marginal column total | 61 (31.9%) | 130 (68.1%) | 191 |
Fourteen of 36 (38.9%) ELISpot-positive peptides had a NetMHCpan IC50 below 34 nM, whereas 47 of 155 (30.3%) ELISpot-negative peptides had an IC50 below 34 nM (Fisher’s exact test, p=0.32, NS). If immunogenicity were predicted from the IC50 value alone, 61 of 191 (31.9%) peptides would be classified as high-affinity binders, and only 14 of these 61 (23.0%) were true positives. Therefore, we could not corroborate the findings from the TESLA consortium.
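For readers who want to re-run the test on Table 5, the following is a minimal, dependency-free sketch of a two-sided Fisher's exact test built on the hypergeometric distribution. The two-sided convention used here (summing the probabilities of all tables that are no more likely than the observed one) is an assumption; the text does not state which variant or software was used, so the resulting p-value may differ somewhat from the reported one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * tolerance
    )


# Table 5: rows = ELISpot positive / negative, columns = IC50 <34 nM / >=34 nM.
p_value = fisher_exact_two_sided(14, 22, 47, 108)

# Precision of the 34 nM cutoff: true positives among predicted strong binders.
precision = 14 / (14 + 47)

print(f"two-sided p = {p_value:.2f}")
print(f"precision of IC50 <34 nM = {precision:.1%}")
```

Whatever the exact convention, the p-value is well above 0.05, consistent with the conclusion that the 34 nM cutoff did not separate ELISpot-positive from ELISpot-negative peptides in this dataset.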