FDG-PET/CT and multimodal machine learning model prediction of pathological complete response to neoadjuvant chemotherapy in triple-negative breast cancer

doi:10.21203/rs.3.rs-5045559/v1

Research Article

https://doi.org/10.21203/rs.3.rs-5045559/v1

This work is licensed under a CC BY 4.0 License

Background:

Triple-negative breast cancer (TNBC) is a biologically and clinically heterogeneous disease, associated with poorer outcomes when compared with other subtypes of breast cancer. Neoadjuvant chemotherapy (NAC) is often given before surgery, and achieving pathological complete response (pCR) has been associated with favorable patient outcomes. There is thus high clinical interest in the ability to accurately predict pCR status using baseline data.

Methods:

A cohort of 57 TNBC patients who had FDG-PET/CT before NAC was analyzed to develop a machine learning (ML) algorithm predictive of pCR. A total of 241 predictors were collected for each patient: 11 clinical features, 11 histo-pathological features, 13 genomic features, and 206 PET features, including 195 radiomics features. The optimization criterion was the Area Under the ROC Curve (AUC). Event-free survival (EFS) was estimated using the Kaplan-Meier method.

Results:

The best ML algorithm reached an AUC of 0.82. The features with the highest weight in the algorithm were a mix of PET (including radiomics), histo-pathological, genomics, and clinical features, highlighting the importance of truly multimodal analysis. Patients with predicted pCR tended to have better EFS than patients with predicted non-pCR, even though this difference was not significant, probably due to the small sample size and the few events observed (P=0.09).

Conclusion:

The study suggests that ML applied to baseline multimodal data can help predict pCR status after NAC for TNBC patients and that the predicted status seems to correlate with long-term outcomes. Patients predicted as non-pCR could benefit from concomitant treatment with immunotherapy or dose intensification.

FDG-PET/CT

triple-negative breast cancer

neoadjuvant chemotherapy

artificial intelligence

machine learning

radiomics

metabolic response

pCR

prognosis

Many patients with stage II-III breast cancer (BC) receive neoadjuvant chemotherapy (NAC) (1). This strategy allows more patients to undergo breast-conserving surgery (BCS) and increases the chances of surgery in patients with primary inoperable disease; it also provides information on the efficacy of chemotherapy (2). Pathological complete response (pCR) after NAC is a strong predictor of favorable outcomes, especially in aggressive breast cancer subtypes such as HER2-positive breast cancer and triple negative breast cancer (TNBC; lacking estrogen receptors (ER), progesterone receptors (PR) and without HER2-overexpression) (3,4). However, the pathological response is only known at the end of NAC, so earlier detection of treatment response could allow treatment adaptation to increase the pCR rate in non-responders (5). Positron emission tomography/computed tomography (PET/CT) with ¹⁸F-fluorodeoxyglucose (FDG) has shown potential to detect residual disease early (after one or two cycles) and also to predict poor outcomes in TNBC patients (6–10). The PET image-derived parameter used in most studies was the decrease in the maximum standardized uptake value (SUV_max) under therapy. Unfortunately, the percentage change in FDG uptake that was considered discriminant varied dramatically across studies, which has hindered translation of the technique into clinical practice (11). Moreover, this method requires two sequential PET scans to measure an early change in the standardized uptake value (ΔSUV) (11), and this is mostly not a standard of care (12).

Various clinical, histo-pathological, and biological breast cancer characteristics assessed before treatment are well-known as prognostic factors. In particular, high-grade tumors are more proliferative and aggressive than low-grade tumors and more prone to respond to chemotherapy. However, tumor grade alone has some limitations in predicting response to treatment, especially in the case of TNBC, where most tumors are of high grade. Thus, the genomic grade index (GGI) was developed to improve BC grading and its prognostic value (13), and in a previous study of TNBC patients we demonstrated that the prediction of response to NAC improved when baseline SUV_max was combined with the GGI (14). By optimizing data from medical imaging, radiomics has also shown the ability to predict response to neoadjuvant chemotherapy for breast cancer (15). Artificial intelligence (AI) with machine learning (ML) techniques and deep learning (DL), a subset of ML, have also recently shown the ability to improve breast imaging performance (16), including therapeutic prediction through PET imaging (17). In that context, our study’s main objective was to develop a multimodal machine learning-based algorithm predictive of pCR using clinical, histopathological, genomics and PET-imaging (including radiomics) pretreatment data. This should make it possible to predict response to NAC on the basis of data available at diagnosis for any patient undergoing treatment for breast cancer.

Study design

The present study was designed to evaluate the predictive value of a combination of parameters based on clinical data (e.g., age, family history of breast cancer, clinical T-stage, clinical N-stage, unifocal or multifocal tumor), histopathological findings (grade, Ki-67 IHC expression), molecular markers measured on pretreatment biopsy (e.g., GGIr, or isolated component of the GGIr: CDC2, CDC20, KPNA2, and MYBL2 gene expressions), and FDG-PET/CT imaging features (e.g., tumor SUV_max, lymph node SUV_max, metabolic tumor volume, radiomics features...). All features were collected in a large database to be processed by a machine learning algorithm.

Eligibility criteria were patients with stage II-III triple negative breast cancers (TNBC) scheduled for neoadjuvant chemotherapy, and with a baseline PET. Patients with distant metastases and patients with bilateral cancer were not included.

We first determined the ability of these parameters alone or in combination, to predict pCR (primary endpoint). Then, we tested if the predicted pCR was a surrogate marker of patient’s outcome (secondary endpoint).

The Institutional Review Board approved the study and stated that no informed consent was needed, considering the non-interventional design of this retrospective analysis (IRB # 00003835, French ethics committee Paris-Saint-Louis, # 2013-27NICB; NCT02600442).

Histo-pathological features and gene expression profiling

Breast cancers were diagnosed based on an ultrasound-guided core-needle biopsy. An experienced pathologist determined tumor type, and histological grade was determined using the modified Scarff-Bloom-Richardson (SBR) grading for invasive carcinoma.

Tumors were defined as triple negative based on immuno-histochemical staining, using specific antibodies and an automated immunostainer (Ventana XT; Tucson, AZ, USA). Tumors were considered ER and PR negative if less than 10% of tumor cells expressed ER and PR. HER2 was over-expressed if HER2 immunostaining was uniform, with intense membrane staining of >30% of invasive tumor cells, following the recommendations of the period of analysis.

Ki67 score was analyzed by immunohistochemistry using MIB-1 antibody (Dako, Glostrup, Denmark) and quantified automatically by the image analysis software Hamamatsu NDP Analyze. The threshold for Ki67 positivity was 14% stained cells, whatever the staining intensity (18).

Total RNA extracted from frozen biopsy was used for molecular analysis. TP53 functional status was determined using a highly efficient yeast functional assay (FASAY) as previously published (19).

Gene expression analysis of Ki67, CDC2, CDC20, KPNA2 and MYBL2 was performed by RT-quantitative PCR. The GGIr score was obtained by combining the expression of the 4 genes (CDC2, CDC20, KPNA2 and MYBL2), covering all cell cycle phases as described (20). We analyzed the predictive value of Ki67 mRNA expression and of the reduced Genomic Grade Index (GGIr) as a continuous value for the association with pCR.

FDG-PET/CT Imaging acquisition

Patients fasted for 6 hours, and blood glucose level had to be less than 7 mmol/L. FDG (5 MBq/kg) was administered and imaging (from mid-thigh level to the base of the skull with the arms raised) started approximately 60 minutes later. The Gemini XL PET/CT scanner (Philips Medical Systems) was used. CT data were acquired first (120 kV; 100 mAs; no contrast enhancement). PET emission data were acquired in 3-dimensional mode, with 2 min per bed position. The attenuation-corrected images were normalized for injected dose and body weight, and subsequently converted into Standardized Uptake Values (SUV), defined as: [tracer concentration (kBq/mL)] / [injected activity (kBq) / patient body weight (g)]. SUV_max was measured in the primary tumor and in the axillary lymph nodes if present.
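As a minimal illustration of the SUV definition given above, the body-weight normalization can be sketched in a few lines of Python. The function name, argument names and example numbers are illustrative assumptions, not values from the study; decay correction of the injected activity, which is applied in practice, is deliberately left out.

```python
def suv_body_weight(tissue_kbq_per_ml, injected_kbq, body_weight_g):
    """SUV = tissue concentration / (injected activity / body weight).

    Assumes 1 g of tissue occupies about 1 mL, so the ratio is unitless.
    Decay correction is NOT applied in this simplified sketch.
    """
    if injected_kbq <= 0 or body_weight_g <= 0:
        raise ValueError("injected activity and body weight must be positive")
    return tissue_kbq_per_ml / (injected_kbq / body_weight_g)


# Hypothetical patient: 70 kg, injected with 5 MBq/kg (350 MBq = 350,000 kBq).
weight_g = 70_000
injected = 5 * 70 * 1_000  # MBq/kg * kg * (kBq per MBq)

# A voxel measuring 25 kBq/mL (invented value) then has an SUV of 25 / 5.
example_suv = suv_body_weight(25.0, injected, weight_g)
print(example_suv)
```

SUV_max is then simply the largest such value among the voxels of the lesion.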

Imaging processing and radiomics features

From the baseline PET/CT imaging data, the primary breast tumor of each patient was segmented in 3D through a semi-automatic segmentation method using a threshold of 42% of SUV_max (Figure 1). The segmentation was performed by an experienced nuclear medicine physician using the SOPHiA DDM for radiomics platform (Research Use Only; SOPHiA GENETICS SA; Switzerland). Radiomics features describing the tumor through its size, shape, voxel intensity distribution, and texture were then extracted following the IBSI standards (21). Metabolic tumor volume (MTV) was determined using the SOPHiA platform.
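The principle of a fixed-threshold segmentation at 42% of SUV_max, and the metabolic tumor volume derived from it, can be sketched as follows. This is not the SOPHiA platform's implementation: the function name is hypothetical, the voxel values are invented, and spatial connectivity of the selected voxels is ignored for brevity.

```python
def threshold_segmentation(suv_values, voxel_volume_ml, fraction=0.42):
    """Keep voxels whose SUV is at least `fraction` of the region's SUV_max.

    suv_values: flat list of SUVs inside a user-drawn region around the tumor.
    Returns (mask, suv_max, metabolic tumor volume in mL, i.e. cm^3).
    """
    if not suv_values:
        raise ValueError("empty region")
    suv_max = max(suv_values)
    cutoff = fraction * suv_max
    mask = [value >= cutoff for value in suv_values]
    mtv_ml = sum(mask) * voxel_volume_ml
    return mask, suv_max, mtv_ml


# Toy region of six voxels, each 4 x 4 x 4 mm (0.064 mL); values are invented.
region = [1.0, 4.0, 5.0, 10.0, 6.0, 2.0]
mask, suv_max, mtv_ml = threshold_segmentation(region, voxel_volume_ml=0.064)
print(mask, suv_max, round(mtv_ml, 3))
```

With SUV_max = 10, the cutoff is 4.2, so three of the six toy voxels are retained.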

Neoadjuvant Chemotherapy Regimen

Some patients (those treated earliest) received EC-D (4 cycles of Epirubicin 75 mg/m² d1 plus Cyclophosphamide 750 mg/m² d1 administered every 3 weeks, followed by 4 cycles of Docetaxel 100 mg/m² d1 every 3 weeks). Patients from the more recent period received Epirubicin 75 mg/m² d1 plus Cyclophosphamide 1200 mg/m² d1 every 2 weeks (SIM) for 6 cycles. After surgery, patients who received SIM chemotherapy received 3 cycles of Docetaxel (75 mg/m² d1 plus cyclophosphamide 750 mg/m² d1) every 3 weeks. The shift towards dose-dense, dose-intense cyclophosphamide-anthracyclines (SIM) in the treatment of TNBC patients at Saint-Louis hospital aimed at increasing pCR rates, based on our previous data (22).

Pathology Assessment, follow-up and Event-free Survival

Pathologic complete response (pCR) was defined as no evidence of residual invasive cancer in breast tissues and lymph nodes (4). Absence of carcinoma in situ was not mandatory.

During neoadjuvant chemotherapy, patients underwent clinical examination every two cycles. After surgery, patients had follow-up visits every 4 months for two years, then twice yearly. Events included local, regional, or distant recurrences or death. Event-free survival (EFS) was defined as the period between the date of surgery and the date of the first event or the last follow-up.

Multimodal Data Aggregation

Clinical, histo-pathological, genomic, PET and radiomics features were aggregated in a large database. Categorical features were one-hot-encoded and numerical features were standardized to achieve a null mean and unit variance. A batch-effect correction for genomic expression data was achieved, inspired by the mean-only ComBat adjustment approach (23). The association between each feature independently with pCR outcome was assessed using non-parametric tests (Wilcoxon rank-sum test for continuous covariates, Fisher’s exact test for binary covariates).
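The two preprocessing steps mentioned above (one-hot encoding and standardization) can be illustrated with a small standard-library sketch. The helper names and toy values are assumptions made for illustration; the population standard deviation is used, which is also the convention of scikit-learn's `StandardScaler`.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale a numeric feature to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Encode a categorical feature as one 0/1 column per category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Invented toy values for three hypothetical patients.
ages = [45.0, 54.0, 64.0]
t_stage = ["T1-T2", "T3-T4", "T3-T4"]

z_ages = standardize(ages)
encoded_stage = one_hot(t_stage)
print(z_ages)
print(encoded_stage)
```

In a cross-validated setting, the mean and standard deviation would normally be estimated on the training patients only and then applied to the held-out patients.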

A hierarchical clustering method was applied to reduce the number of radiomic features, with a bootstrap approach to determine the optimal number of clusters given the stability of the partitions. Six groups of radiomic features were defined, and only one feature per group was selected. Relevant non-radiomic features were selected based on their completion rate (less than 50%) and univariable feature-outcome analyses combined with clinical expertise. A single imputation by chained equations was then carried out to handle missing values in predictors using the MICE algorithm with randomized decision trees and 10 iterations (24).
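The idea of grouping redundant radiomic features and keeping one representative per group can be sketched as below. The distance (1 minus the absolute Pearson correlation), the average linkage, the choice of the first member as representative, and the toy feature values are all assumptions made for illustration; the bootstrap-based choice of the number of clusters is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_features(columns, n_clusters):
    """Average-linkage agglomerative clustering on 1 - |r| distances."""
    names = list(columns)
    dist = {(a, b): 1.0 - abs(pearson(columns[a], columns[b]))
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        # Merge the two closest clusters (j > i, so deleting j is safe).
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Invented feature values for five hypothetical tumors.
toy = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "surface": [2.0, 4.0, 6.0, 8.0, 10.5],
    "entropy": [5.0, 1.0, 4.0, 2.0, 3.0],
    "contrast": [10.1, 2.0, 8.0, 4.2, 6.0],
}
groups = cluster_features(toy, n_clusters=2)
representatives = [group[0] for group in groups]
print(groups, representatives)
```

In this toy example the two size-related features end up in one group and the two texture-related features in the other.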

Machine Learning Model Development

Several machine learning algorithms were then trained, including logistic regression models (with either LASSO, Ridge, or Elastic-net regularization), binary decision trees, support vector machines (with linear kernel) and random forests. Due to the small cohort size, a nested leave-pair-out cross-validation (LPOCV) approach (60 random pairs in the inner resampling, 756 (all) random pairs in the outer resampling) was used to correctly estimate the predictive performance of the models and select the best one (25). A grid search was applied with the Area Under the ROC Curve (AUC) as optimization criterion. Additional Materials give the grid of hyperparameters which was explored for each model. The uncertainty of the estimated predictive metrics (95% confidence intervals) and the comparison of performances obtained using various sets of data modalities (P-values) were quantified through a 10000-sample bootstrapping with outcome stratification over the pairs of patients used for the nested LPOCV.
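The principle of leave-pair-out cross-validation can be sketched as follows: each pair made of one patient from each class is held out in turn, the model is refitted on the remaining patients, and the AUC is estimated as the fraction of pairs ranked correctly. This is a simplified, non-nested sketch; the centroid-based scorer is a toy stand-in (not the SVM used in the study), the inner hyperparameter search is omitted, and all names and data are hypothetical.

```python
from itertools import product


def fit_centroids(X, y):
    """Toy stand-in model: store the mean feature vector of each class."""
    def centroid(label):
        rows = [x for x, lab in zip(X, y) if lab == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    return centroid(0), centroid(1)


def score_centroids(model, x):
    """Higher score = closer to the class-1 centroid along the mean difference."""
    mu0, mu1 = model
    return sum((m1 - m0) * (xi - (m0 + m1) / 2)
               for m0, m1, xi in zip(mu0, mu1, x))


def leave_pair_out_auc(X, y, fit, score):
    """Estimate the AUC by holding out every (class 1, class 0) pair in turn."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    concordant, n_pairs = 0.0, 0
    for i, j in product(pos, neg):
        train = [k for k in range(len(y)) if k not in (i, j)]
        model = fit([X[k] for k in train], [y[k] for k in train])
        s_pos, s_neg = score(model, X[i]), score(model, X[j])
        # A correctly ranked pair counts 1, a tie counts 0.5.
        concordant += 1.0 if s_pos > s_neg else 0.5 if s_pos == s_neg else 0.0
        n_pairs += 1
    return concordant / n_pairs, n_pairs


# Invented, well-separated one-feature toy data.
X_toy = [[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]]
y_toy = [0, 0, 0, 1, 1, 1]
auc, n_pairs = leave_pair_out_auc(X_toy, y_toy, fit_centroids, score_centroids)
print(auc, n_pairs)

# With 36 non-pCR and 21 pCR patients, the number of possible pairs is 36 * 21.
n_outer_pairs = 36 * 21
print(n_outer_pairs)
```

The last line recovers the 756 outer pairs quoted above from the class sizes of the cohort.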

The best ML model, according to the estimated predictive performances, was finally trained using a “standard” leave-pair-out cross-validation procedure. Global interpretability tools ensured a correct understanding, validation, and justification of the prediction model. In addition, Shapley Additive exPlanations (SHAP) values made it possible to explain each patient-specific predicted probability of non-pCR.
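For a linear model, SHAP values have a simple closed form when features are assumed independent: each feature contributes its weight times the deviation of the patient's value from the background mean. The sketch below illustrates this on the linear decision score (not on a calibrated probability); the weights, bias and patient values are hypothetical and are not the coefficients of the study's model.

```python
def linear_shap(weights, bias, x, background_means):
    """Contributions for a linear score f(x) = w.x + b (independent features).

    phi_i = w_i * (x_i - mean_i); base value = f(background means).
    """
    phi = [w * (xi - m) for w, xi, m in zip(weights, x, background_means)]
    base_value = bias + sum(w * m for w, m in zip(weights, background_means))
    return phi, base_value


# Hypothetical coefficients for three standardized features.
weights = [0.8, -0.5, 0.3]
bias = 0.1
means = [0.0, 0.0, 0.0]      # standardized features have zero mean
patient = [1.2, -0.4, 2.0]   # invented patient profile

phi, base = linear_shap(weights, bias, patient, means)
score = bias + sum(w * xi for w, xi in zip(weights, patient))
print(phi, base, score)
```

By construction, the base value plus the sum of the contributions equals the patient's decision score, which is the additivity property that makes such explanations readable patient by patient.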

EFS was estimated using the Kaplan-Meier method. The log-rank test was used to compare EFS among patients predicted pCR vs. non-pCR by the best ML model.
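The Kaplan-Meier (product-limit) estimator can be written compactly, as in the sketch below. It is an illustration only: the analyses in this study relied on the lifelines library, and the follow-up times and event indicators here are invented.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: time to event or censoring; events: 1 = event, 0 = censored.
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        n_leaving = sum(1 for d in durations if d == t)
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= n_leaving
    return curve


# Invented follow-up (months) for five hypothetical patients.
months = [5, 8, 12, 12, 20]
relapsed = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(months, relapsed)
print(km_curve)
```

Censored patients (here at 8 and 20 months) do not lower the curve but reduce the number at risk for later event times.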

All statistical analyses were implemented in Python (version 3.8.0), using the scikit-learn (version 1.1.1) and lifelines (version 0.27.7) libraries.

Patient’s characteristics and building of the database for machine learning

The study enrolled 57 patients with stage II or III TNBC treated in the neoadjuvant setting between 2008 and 2015 at the Saint-Louis hospital. No patients had distant metastases on pretherapeutic PET/CT. Lymph node involvement was clinically apparent in 57.9% of patients (Table 1). Most tumors were of no specific histological type (52 breast carcinomas of non-specific type and 5 metaplastic carcinomas) and grade-3 (91.2%). At baseline, median tumor SUV_max was 10 (min = 3; max = 31.4) and median MTV was 7.9 cm³ (Table 1).

Table 1

**Overall Characteristics of the 57 Triple Negative Breast Cancer Patients**
Patient characteristics (N = 57)	Summary∗
Age in years	54.0 (45.0, 64.0)
Family history of breast cancer
  No	46 (80.7)
  Yes	10 (17.5)
  Missing	1 (1.8)
Clinical T-stage∗∗
  T1-T2	27 (47.4)
  T3-T4	30 (52.6)
Clinical N-stage∗∗
  N0	24 (42.1)
  N+	33 (57.9)
Histological type
  Non specific	52 (91.2)
  Metaplastic	5 (8.8)
Histological grade
  Grade-1-2	5 (8.8)
  Grade-3	52 (91.2)
P53 mutation
  Wild type	5 (8.8)
  Mutated	52 (91.2)
Ki-67 mRNA expression (x1000)	533.1 (245.0, 781.6)
  Missing	13
GGIr† (x1000)	391.1 (172.2, 598.4)
Tumor SUV_max	10.0 (7.2, 14.8)
Metabolic tumor volume (cm³)	7.9 (3.8, 18.9)
Chemotherapy regimen
  EC-D	8 (14.0)
  SIM	49 (86.0)
Surgery
  Breast-conserving surgery	28 (49.1)
  Mastectomy	28 (49.1)
  No surgery	1 (1.8)
Pathological findings
  pCR	21 (36.8)
  non pCR	36 (63.2)
∗ Continuous data: Median (Q1: first quartile, Q3: third quartile); Categorical data: Amount (Percentage)
∗∗ Clinical classification before FDG-PET/CT according to the eighth edition of the AJCC Staging Manual.
† GGIr: reduced genomic grade index (calculated after batch-effect correction)
EC-D, sequential regimen of 4 cycles of Epirubicin 75 mg/m² plus Cyclophosphamide 750 mg/m² followed by 4 courses of Docetaxel 100 mg/m²; SIM, intensified regimen of Epirubicin 75 mg/m² plus Cyclophosphamide 1200 mg/m² for 6 cycles; DC, 6 cycles of Docetaxel + Cyclophosphamide; pCR, Pathological Complete Response.

Eight patients received EC-D, while 49 were treated with the SIM protocol. Surgery was performed after NAC in all patients (28 breast-conserving surgeries and 28 mastectomies) except one, who showed clinical progression during neoadjuvant treatment.

The multimodal pretreatment data were aggregated, resulting in a total of 241 predictors collected for each patient (Fig. 2): 11 clinical features, 11 histo-pathological features, 13 genomic features, and 206 PET features, including 195 radiomics features (20 describing the tumor size, 13 characterizing its shape, 76 describing its voxel intensity distribution, and 86 its texture).

Association between main features and pCR (univariate analysis)

More than half of the patients had no pCR (36 of 57 patients; 63.2%). Table 2 shows the univariate association of each main feature with the pathological findings.

Table 2

**Association between the main features and pCR**
Modality	Feature^∗	non-pCR (N = 36, 63.2%)	pCR (N = 21, 36.8%)	P-value
Clinical	Age in years (Q1, Q3)	54.0 (42.3, 63.0)	54.0 (49.0, 64.0)	0.73
	Family history of breast cancer			0.08
	  No	27 (75.0)	19 (95.0)
	  Yes	9 (25.0)	1 (5.0)
	  Missing	0	1
	Contraception			0.17
	  No	14 (38.9)	12 (60)
	  Yes	22 (61.1)	8 (40)
	  Missing	0	1
	Clinical T-stage^∗			0.03
	  T1-T2	13 (36.1)	14 (66.7)
	  T3-T4	23 (63.9)	7 (33.3)
	Clinical N-stage^∗			0.78
	  N0	16 (44.4)	8 (38.1)
	  N+	20 (55.6)	13 (61.9)
	Chemotherapy regimen			1.00
	  EC-D	5 (13.9)	3 (14.2)
	  SIM	31 (86.1)	18 (85.8)
	Surgery			0.27
	  Breast-conserving surgery	15 (42.9)	13 (61.9)
	  Mastectomy	20 (57.1)	8 (38.1)
	  No surgery	1	0
Histopathological	Histological type			0.64
	  Non specific	32 (88.9)	20 (95.2)
	  Metaplastic	4 (11.1)	1 (4.8)
	Histological grade			0.15
	  Grade-1-2	5 (13.9)	0 (0.0)
	  Grade-3	31 (86.1)	21 (100.0)
	Mitoses count	20 (3, 30)	13.5 (3.7, 26.2)	0.83
	Ki67 score (Automate)	35 (9, 64.5)	42 (31.2, 59.5)	0.35
Gene expression profiling	Ki-67 mRNA expression Missing	482.1 (280.8, 631.2)	579.0 (179.1, 862.6)	0.38
	P53 mutation			0.35
	  Wild type	2 (5.6)	3 (14.3)
	  Mutated	34 (94.4)	18 (85.7)
	CDC2 (x1000)	93.5 (53.4, 164.9)	202.5 (127.6, 289.1)	0.02
	CDC20 (x1000)	486.8 (214.4, 828.6)	1042.5 (505.2, 1353.5)	0.04
	KPNA2 (x1000)	251.8 (158.6, 339.9)	426.6 (272.9, 761.9)	0.01
	MYBL2 (x1000)	251.4 (129.0, 483.6)	421.9 (318.5, 636.9)	0.01
	GGIr^∗∗ (x1000)	322.5 (154.5, 440.5)	588.0 (376.6, 757.8)	0.01
PET-non radiomics features	Tumor SUV_max	9.2 (7.0, 12.5)	13.2 (8.5, 20.1)	0.07
	Lymph nodes SUV_max	1.4 (0, 9.6)	2.2 (0, 5.8)	0.95
	Metabolic tumor volume (cm³)	9.3 (4.9, 17.2)	6.6 (3.0, 23.6)	0.70
Radiomics	Morphological-sphericity (x1000)	945.5 (911.5, 959.3)	935.0 (859.0, 959.0)	0.65
	Intensity-skewness (x1000)	473.5 (321.3, 664.8)	610.0 (329.0, 816.0)	0.56
	Discretized-intensity-uniformity (x1000)	1.2 (0.9, 1.8)	1.9 (1.3, 2.4)	0.05
	GLCM-contrast	31.0 (12.4, 46.6)	50.0 (18.0, 68.8)	0.12
	GLDZM-large distance low grey level emphasis (x1000)	2.2 (1.6, 4.5)	1.5 (1.1, 3.4)	0.14
	NGLDM-low dependence low grey level emphasis (x10000)	0.8 (0.5, 1.1)	0.5 (0.4, 1.0)	0.14
^∗ Clinical classification before FDG-PET/CT according to the eighth edition of the AJCC Staging Manual.
EC-D, sequential regimen of 4 cycles of Epirubicin 75 mg/m² plus Cyclophosphamide 750 mg/m² followed by 4 courses of Docetaxel 100 mg/m²; SIM, intensified regimen of Epirubicin 75 mg/m² plus Cyclophosphamide 1200 mg/m² for 6 cycles; DC, 6 cycles of Docetaxel + Cyclophosphamide; pCR, Pathological Complete Response.
^∗∗GGIr: reduced genomic grade index (calculated after batch-effect correction); GGIr represents a combination of 4 genes (CDC2, CDC20, KPNA2 and MYBL2), covering all phases of the cell cycle.
GLCM: grey level co-occurrence matrix; GLDZM: grey level distance zone matrix; NGLDM: neighbouring grey level dependence matrix
Continuous data: Median (Q1: first quartile, Q3: third quartile)
Categorical data: Amount (Percentage)
p-values obtained from Wilcoxon rank-sum test for continuous data, Fisher’s exact test for categorical data.

Features in Bold: features selected for the final analysis

P value in Bold: significant P value

No significant relation between the chemotherapy regimen and pCR was observed but the number of patients treated with EC-D was limited (pCR rate: 37.5% for the 8 patients treated with EC-D vs. 36.7% for the 49 patients who received SIM, P = 1.0).

We found no significant relation between tumor histology and pCR (P = 0.64) or between grade and pCR (P = 0.15); however, most patients had high-grade invasive non-specific type carcinoma. Ki67 score measured by IHC and gene expression of Ki67 were not associated with pCR. Among the gene expression features, GGIr (and each of its components) was associated with the pathological response (Table 2). Pathological complete response (pCR) was more frequent in T1-T2 tumors than in T3-T4 tumors (P = 0.03). The absolute value of SUV_max measured at baseline PET was not significantly associated with pCR (P = 0.07); however, there was a trend for higher SUV_max value in the case of pCR (13.2 vs. 9.2).
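The univariate comparison of clinical T-stage between response groups can be reproduced in principle with a two-sided Fisher's exact test on the counts reported in Table 2 (non-pCR: 13 T1-T2 and 23 T3-T4; pCR: 14 T1-T2 and 7 T3-T4). The standard-library sketch below uses the common convention of summing the probabilities of all tables that are no more likely than the observed one; the function name is hypothetical and the exact convention used by the authors' software is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Counts from Table 2: rows = non-pCR / pCR, columns = T1-T2 / T3-T4.
p_t_stage = fisher_exact_two_sided(13, 23, 14, 7)
print(round(p_t_stage, 3))
```

The result can be compared with the P = 0.03 reported for this feature.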

Prediction of pCR with ML algorithm

Of the 241 features collected for each patient, 17 features were selected for the final analysis (Table 2: features in bold): three clinical features (clinical T-stage, contraception, and family history of breast cancer), two histopathological features (mitoses count, Ki-67 score determined by automated image analysis), three molecular features (Ki-67 mRNA expression, TP53 mutational status evaluated by functional assay and GGIr), three PET-non radiomics features (tumor SUV_max, lymph node SUV_max, and MTV), and six radiomics parameters (morphological-sphericity, intensity-skewness, discretized-intensity-uniformity, GLCM-contrast, GLDZM-large distance low grey level emphasis and NGLDM-low dependence low grey level emphasis). The performance of each machine-learning model in predicting pCR was estimated for several sets of multimodal predictors (Table 3). The best ML predictive model was a Support Vector Machine (SVM) algorithm with a linear kernel (Table 3), and the best predictive results were achieved using the aggregation of clinical data, histopathological and molecular features, and PET data, including radiomics features (Table 3). With the SVM algorithm, the AUC was 0.63 (95% CI = 0.51–0.73) for clinical + histo-pathological + PET non-radiomics data (first set), 0.70 (0.60–0.80) after adding molecular data (second set), and 0.82 (0.74–0.90) after also including the whole set of radiomics features. The estimated AUC was significantly better when considering the whole set of multimodal data than when using the first or second set of data modalities, with P = 0.001 (whole set of data vs. first set) and P = 0.03 (whole set of data vs. second set), respectively. Compared to the value of 0.82 for SVM, the AUCs were 0.67 for the Decision Tree algorithm, 0.65 for the Random Forest algorithm and 0.64 for the Logit algorithm (Table 3).

Table 3

Estimated predictive performances according to different sets of multimodal predictors and several Machine Learning algorithms.
Set	Data modalities	Model	AUC	Accuracy	Se	Sp	PPV	NPV
1	clinical + histo-pathological and PET non radiomics features	SVM	0.63	0.53	0.57	0.48	0.6§	0.40
		DecisionTree	0.45	0.45	0.63	0.27	0.59	0.29
		RandomForest	0.56	0.48	0.67	0.30	0.62	0.35
		Logit	0.49	0.44	0.15	0.73	0.50	0.34
2	1 + genomic data	SVM	0.70	0.61	0.58	0.63	0.73	0.47
		DecisionTree	0.70	0.67	0.72	0.62	0.77	0.56
		RandomForest	0.65	0.64	0.76	0.53	0.73	0.56
		Logit	0.65	0.61	0.61	0.60	0.72	0.47
3	2 + whole set of radiomics features	SVM	0.82	0.71	0.71	0.70	0.80	0.59
		DecisionTree	0.67	0.65	0.74	0.56	0.74	0.55
		RandomForest	0.65	0.54	0.74	0.35	0.66	0.44
		Logit	0.64	0.60	0.60	0.61	0.72	0.47
AUC: area under the ROC curve; Se: sensitivity; Sp: specificity; PPV: positive predictive value; NPV: negative predictive value; SVM: support vector machines (with linear kernel)
The metrics were estimated using the nested leave-pair-out cross-validation
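The accuracy, sensitivity, specificity, PPV and NPV listed in Table 3 follow the usual confusion-matrix definitions, which can be written out as in the sketch below. The counts are hypothetical and are not taken from the study; the helper name is an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for 20 patients (invented for illustration).
metrics = diagnostic_metrics(tp=8, fp=2, tn=6, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike the AUC, these metrics depend on the decision threshold applied to the model output, and PPV and NPV also depend on the proportion of each class in the cohort.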

Figure 3 shows the estimated coefficients of each feature in the SVM model. Tumor SUV_max, GGIr and clinical T-stage were the three most important predictors. Four of the seven most important features were derived from PET imaging (baseline tumor SUV_max, MTV, and 2 radiomics features: Morphological-sphericity and NGLDM-low dependence low grey level emphasis).

Event-free survival

Among the 56 patients included in the EFS analysis, 15 patients relapsed during the follow-up period. As shown in Fig. 4, patients with predicted pCR tended to have better EFS than patients with predicted non-pCR, even though this difference was not significant, probably due to the small sample size and the few events observed (P = 0.09).

In a homogeneous series of 57 TNBC patients, although baseline PET SUV_max alone was not predictive of pathological response after neoadjuvant chemotherapy, a dataset of 17 pretherapeutic features (mixing clinical, histopathological, molecular and pretreatment PET findings) processed with a machine learning algorithm was highly predictive of pCR (AUC = 0.82 (95% CI = 0.74–0.90) with the SVM algorithm). In a previous study with TNBC patients, we observed that tumor baseline SUV_max value combined with some molecular features was predictive of pCR, but with limited performance (AUC = 0.76 for baseline SUV_max + GGI) (14). The change of FDG uptake between baseline PET and an interim PET performed after 2 cycles of NAC (ΔSUV_max) was related to the pCR rate with a higher accuracy (AUC = 0.81) (14). In the present study, the interim PET was not used. We focused on a panel of features, all determined at baseline, and the Support Vector Machine (SVM) algorithm predicted the pCR with an AUC of 0.82. Among the various prediction models (logistic regression with regularization methods, binary decision trees, random forests and support vector machines with a linear kernel), the SVM algorithm performed best.

Machine-learning and deep-learning are emerging in nuclear medicine. Very few studies have evaluated the usefulness of AI in predicting response to breast cancer treatment and patient’s outcome. In 56 breast cancer patients, the AUC for predicting histopathological response after NAC improved after deep learning using a convolutional neural network (CNN) (17). Different breast cancer subtypes were included, and subgroup analysis revealed that the datasets were not predictive in the triple-negative group (17). Only eight patients had a TNBC and only four parameters were analyzed (3 PET features and 1 MRI feature). In another study about FDG PET/CT imaging and AI, a CNN model was also useful to predict pCR (accuracy of 84.79%), but 31 patients with breast cancer of mixed subtype were included and only 3 had a TNBC (26).

Some other studies have also explored multimodal analysis including MRI radiomics features to predict pCR in breast cancer. Similarly, the addition of clinical, biological and radiomics parameters was shown to improve predictive performance (27), paving the way for combined predictive biomarker strategies.

In our study of 57 TNBC patients, the developed ML algorithm could predict pCR using a dataset of 17 features. The best result was obtained using the aggregation of clinical, histopathological, genomics, and PET features, highlighting the importance of a truly multimodal analysis. Withdrawing a specific data modality (e.g., radiomics features or genomic data) led to a decrease of almost 10% in the AUC (Table 3). PET features, including radiomics, were important in the model.

Four of the seven most important features were derived from PET imaging (baseline tumor SUV_max, MTV, and 2 radiomics features: Morphological-sphericity and NGLDM-low dependence low grey level emphasis). These features were complementary, describing the tumor through its size, its shape, its voxel intensity distribution, and its texture.

Some of these criteria probably define triple-negative tumors well, which are aggressive tumors that can sometimes proliferate rapidly, heterogeneously, with areas of necrosis (28). Optimal characteristics would likely be different for other molecular subtypes, including texture parameters, indicating the utility of developing specific predictive criteria for each of the molecular subgroups of breast cancer. Previously, in 143 consecutive ER+/HER2- breast cancer patients from our cohort (NCT02600442), we found a significant association between patient outcome and selected PET parameters measured at baseline: tumor SUV_max, MTV, total lesion glycolysis, and entropy (29). Other teams have also evaluated the ability of radiomic parameters to predict response in patients treated with neoadjuvant chemotherapy for breast cancer, but these works have involved tumors with different molecular subgroups (15, 30–32). In a study of 79 patients with BC of different subtypes, the relationship between pretreatment PET parameters, including radiomics features, and pCR to NAC was analyzed by multiple logistic regression models (32). All models showed that the molecular subtype of the tumor was the main predictor (32).

Contrary to previously published studies, the focus on a homogeneous group of breast cancers is a strength of our study. In addition, the study is based on an extended follow-up, as the patients were treated between 2008 and 2015 and the follow-up period extends to 2023. Recurrence of triple-negative breast cancer usually occurs early, within 2–3 years after completion of neoadjuvant chemotherapy (3). As shown in Fig. 4, patients with pCR predicted by the multimodal algorithm tended to have better event-free survival than patients with predicted non-pCR, even though this difference was not significant, probably due to the small sample size and the few events observed. Our observation is, however, in agreement with the literature, in which pCR is a strong predictor of survival in TNBC (4).

Our study has some limitations. The study was conducted at a single center and performed retrospectively on a limited cohort because we chose to focus on a homogeneous group of triple-negative breast cancer. Our results will have to be confirmed by a large prospective multicenter study. But this study is proof of concept of the potential value of multimodal models evaluating routinely available clinical, pathological, biological and nuclear imaging parameters to predict response to neoadjuvant chemotherapy and guide the treatment regimen choice for patients.

In conclusion, our study suggests that machine learning applied to baseline multimodal data can help predict pCR status after neoadjuvant chemotherapy for TNBC patients and that the predicted status seems to correlate with long-term outcomes. Identifying patients predicted not to achieve pCR under standard chemotherapy, right from diagnosis, could enable modulation of the usual treatment, with the addition of immunotherapy or dose intensification. These tools can enable precision medicine, which should also be applied to chemotherapy regimens.

Conflict of interest

The authors declared no conflict of interest.

Funding source

This study was in part supported by an academic grant from the French national cancer institute (Grant Reference Number “Translational research in oncology” INCa-DGOS-5697).

Ethical Approval

Approval was not required.

Author Contribution

D.G and J.L-C wrote the main manuscript text and L.F. prepared figures. All authors reviewed the manuscript.

NCCN Clinical Practice Guidelines in Oncology. Breast Cancer. Version 1. 2024. Available at: https://www.nccn.org/professionals/physician_gls/pdf/breast.pdf.
Gralow JR, Burstein HJ, Wood W, et al. Preoperative therapy in invasive breast cancer: pathologic assessment and systemic therapy issues in operable disease. J Clin Oncol. 2008;26(5):814-819.
Carey LA, Dees EC, Sawyer L, et al. The triple negative paradox: primary tumor chemosensitivity of breast cancer subtypes. Clin Cancer Res. 2007;13(8):2329-2334.
Cortazar P, Zhang L, Untch M, et al. Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis. Lancet. 2014;384(9938):164-172.
Groheux D. Predicting pathological complete response in breast cancer early. Lancet Oncol. 2014;15(13):1415-1416.
Groheux D, Hindié E, Giacchetti S, et al. Triple-Negative Breast Cancer: Early Assessment with 18F-FDG PET/CT During Neoadjuvant Chemotherapy Identifies Patients Who Are Unlikely to Achieve a Pathologic Complete Response and Are at a High Risk of Early Relapse. J Nucl Med. 2012;53(2):249-254.
Koolen BB, Pengel KE, Wesseling J, et al. Sequential (18)F-FDG PET/CT for early prediction of complete pathological response in breast and axilla during neoadjuvant chemotherapy. Eur J Nucl Med Mol Imaging. 2014;41(1):32-40.
Groheux D, Hindié E, Giacchetti S, et al. Early assessment with 18F-fluorodeoxyglucose positron emission tomography/computed tomography can help predict the outcome of neoadjuvant chemotherapy in triple negative breast cancer. Eur J Cancer. 2014;50(11):1864-1871.
Humbert O, Riedinger J-M, Charon-Barra C, et al. Identification of biomarkers including 18FDG-PET/CT for early prediction of response to neoadjuvant chemotherapy in Triple Negative Breast Cancer. Clin Cancer Res. 2015;21(24):5460-5468.
Groheux D, Biard L, Giacchetti S, et al. 18F-FDG PET/CT for the Early Evaluation of Response to Neoadjuvant Treatment in Triple-Negative Breast Cancer: Influence of the Chemotherapy Regimen. J Nucl Med. 2016;57(4):536-543.
Groheux D, Mankoff D, Espié M, Hindié E. (18)F-FDG PET/CT in the early prediction of pathological response in aggressive subtypes of breast cancer: review of the literature and recommendations for use in clinical trials. Eur J Nucl Med Mol Imaging. 2016;43(5):983-993.
Salaün P-Y, Abgral R, Malard O, et al. Good clinical practice recommendations for the use of PET/CT in oncology. Eur J Nucl Med Mol Imaging. 2020;47(1):28-50.
Bertucci F, Finetti P, Roche H, et al. Comparison of the prognostic value of genomic grade index, Ki67 expression and mitotic activity index in early node-positive breast cancer patients. Ann Oncol. 2013;24(3):625-632.
Groheux D, Biard L, Lehmann-Che J, et al. Tumor metabolism assessed by FDG-PET/CT and tumor proliferation assessed by genomic grade index to predict response to neoadjuvant chemotherapy in triple negative breast cancer. Eur J Nucl Med Mol Imaging. 2018;45(8):1279-1288.
Li P, Wang X, Xu C, et al. 18F-FDG PET/CT radiomic predictors of pathologic complete response (pCR) to neoadjuvant chemotherapy in breast cancer patients. Eur J Nucl Med Mol Imaging. 2020;47(5):1116-1126.
Balkenende L, Teuwen J, Mann RM. Application of Deep Learning in Breast Cancer Imaging. Semin Nucl Med. 2022;52(5):584-596.
Choi JH, Kim H-A, Kim W, et al. Early prediction of neoadjuvant chemotherapy response for advanced breast cancer using PET/MRI image deep learning. Sci Rep. 2020;10(1):21149.
Nielsen TO, Leung SCY, Rimm DL, et al. Assessment of Ki67 in Breast Cancer: Updated Recommendations From the International Ki67 in Breast Cancer Working Group. J Natl Cancer Inst. 2021;113(7):808-819.
Dumay A, Feugeas J-P, Wittmer E, et al. Distinct tumor protein p53 mutants in breast cancer subgroups. Int J Cancer. 2013;132(5):1227-1231.
Toussaint J, Sieuwerts AM, Haibe-Kains B, et al. Improvement of the clinical applicability of the Genomic Grade Index through a qRT-PCR test performed on frozen and formalin-fixed paraffin-embedded tissues. BMC Genomics. 2009;10:424.
Zwanenburg A, Vallières M, Abdalah MA, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology. 2020;295(2):328-338.
Giacchetti S, Porcher R, Lehmann-Che J, et al. Long-term survival of advanced triple-negative breast cancers with a dose-intense cyclophosphamide/anthracycline neoadjuvant regimen. Br J Cancer. 2014;110(6):1413-1419.
Zhang Y, Jenkins DF, Manimaran S, Johnson WE. Alternative empirical Bayes models for adjusting for batch effects in genomic studies. BMC Bioinformatics. 2018;19(1):262.
Buuren S van, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software. 2011;45:1-67.
Vabalas A, Gowen E, Poliakoff E, Casson AJ. Machine learning algorithm validation with a limited sample size. PLoS One. 2019;14(11):e0224365.
Bulut G, Atilgan HI, Çınarer G, Kılıç K, Yıkar D, Parlar T. Prediction of pathological complete response to neoadjuvant chemotherapy in locally advanced breast cancer by using a deep learning model with 18F-FDG PET/CT. PLoS One. 2023;18(9):e0290543.
Pesapane F, Rotili A, Botta F, et al. Radiomics of MRI for the Prediction of the Pathological Response to Neoadjuvant Chemotherapy in Breast Cancer Patients: A Single Referral Centre Analysis. Cancers (Basel). 2021;13(17):4271.
García-Castro A, Zonca M, Florindo-Pinheiro D, et al. APRIL promotes breast tumor growth and metastasis and is associated with aggressive basal breast cancer. Carcinogenesis. 2015;36(5):574-584.
Groheux D, Martineau A, Teixeira L, et al. 18FDG-PET/CT for predicting the outcome in ER+/HER2- breast cancer patients: comparison of clinicopathological parameters and PET image-derived indices including tumor texture analysis. Breast Cancer Res. 2017;19(1):3.
Molina-García D, García-Vicente AM, Pérez-Beteta J, et al. Intratumoral heterogeneity in 18F-FDG PET/CT by textural analysis in breast cancer as a predictive and prognostic subrogate. Ann Nucl Med. 2018;32(6):379-388.
Yoon H-J, Kim Y, Chung J, Kim BS. Predicting neo-adjuvant chemotherapy response and progression-free survival of locally advanced breast cancer using textural features of intratumoral heterogeneity on F-18 FDG PET/CT and diffusion-weighted MR imaging. Breast J. 2019;25(3):373-380.
Antunovic L, De Sanctis R, Cozzi L, et al. PET/CT radiomics in breast cancer: promising tool for prediction of pathological response to neoadjuvant chemotherapy. Eur J Nucl Med Mol Imaging. 2019;46(7):1468-1477.

Reviews received at journal
18 Nov, 2024
Reviewers agreed at journal
08 Nov, 2024
Reviews received at journal
26 Sep, 2024
Reviewers agreed at journal
17 Sep, 2024
Reviewers invited by journal
16 Sep, 2024
Editor assigned by journal
09 Sep, 2024
Submission checks completed at journal
09 Sep, 2024
First submitted to journal
06 Sep, 2024
