Deep transfer learning radiomics based on two-dimensional ultrasound for predicting the efficacy of neoadjuvant chemotherapy in breast cancer

doi:10.21203/rs.3.rs-2427398/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Purpose

We investigated the predictive value of a comprehensive model based on preoperative ultrasound radiomics, deep transfer learning, and clinical features for pathological complete response (pCR) after neoadjuvant chemotherapy (NAC) in breast cancer.

Methods

We enrolled 211 patients with pathologically confirmed breast cancer who underwent NAC. The patients were randomly divided into a training set and a validation set in a ratio of 7:3. Deep learning and radiomics features were extracted from pre-treatment ultrasound images, and random forest-based recursive feature elimination and the least absolute shrinkage and selection operator were used for feature screening and for constructing the DL-Score and Rad-Score. Based on multiple logistic regression, the independent clinical predictors, DL-Score, and Rad-Score were combined to construct the comprehensive prediction model DLR + C. The performance of the model was evaluated in terms of its predictive effect, calibration ability, and clinical practicability.

Results

The DLR + C model predicted pCR status more accurately than the clinical, radiomics (Rad-Score), and deep learning (DL-Score) models, with an area under the curve (AUC) of 0.906 (95% CI: 0.871–0.935) in the training set and 0.849 (95% CI: 0.799–0.887) in the validation set, and with good calibration ability (Hosmer-Lemeshow: P > 0.05). Moreover, decision curve analysis confirmed that the DLR + C model had the highest clinical value among all models.

Conclusion

The comprehensive model DLR + C, based on ultrasound radiomics, deep transfer learning, and clinical features, can effectively and accurately predict the pCR status of breast cancer after NAC, which can assist personalized clinical diagnosis and treatment planning.

Breast ultrasound

radiomics

deep transfer learning

invasive ductal carcinoma

pathological complete response

Breast cancer has the highest incidence of any cancer worldwide and is the main malignant tumor endangering the health of women [1]. With the continuous development of comprehensive treatment, the survival of patients with breast cancer has been significantly prolonged. Neoadjuvant chemotherapy (NAC) is central to treatment: it can lower the clinical stage of the tumor, reduce the tumor volume, and reduce lymph node metastasis. It thereby gives patients with late-stage breast cancer the opportunity for surgery, increases the possibility of breast preservation, and improves prognosis [2, 3]. Studies have shown that patients who achieved pathological complete response (pCR) after neoadjuvant therapy had better long-term outcomes, with higher disease-free survival (DFS) and overall survival (OS), than those who did not [4]. Because breast tumors are complex and heterogeneous, some patients experience disease progression or are insensitive to chemotherapy during NAC [5]. It is therefore necessary to predict at an early stage whether a lesion will achieve pCR after NAC, which would help clinicians select the next treatment, avoid unnecessary treatment and its toxic side effects, and enable more patients to benefit from NAC.

Ultrasound, mammography, MRI, and other imaging examinations are central to the screening, diagnosis, staging, and efficacy monitoring of breast cancer [6, 7]. Numerous studies have shown that imaging examinations can also assess the response of the tumor and the pCR of axillary lymph nodes after NAC, with excellent diagnostic efficacy [8–11]. Under the Response Evaluation Criteria in Solid Tumors (RECIST), efficacy is evaluated by comparing the maximum diameter of the target lesion measured on ultrasound, MRI, or mammography before and after NAC [12, 13]. In clinical practice, pCR is usually determined from the postoperative pathological examination of surgical specimens. However, because all of these assessments rely on results obtained after NAC, they cannot be used to adjust the NAC regimen in a timely manner during the early stage of treatment [14].

Ultrasound is the first choice for breast cancer screening because it is economical, non-invasive, fast, and provides real-time images. The Breast Imaging Reporting and Data System (BI-RADS) is used to evaluate the mass and axillary lymph nodes and is widely used to evaluate tumors during NAC [15]. As an emerging technology for extracting quantitative image features from standard medical images, radiomics has become increasingly important in cancer research for improving the accuracy of diagnosis and prognosis [16]. Compared to traditional radiomics, deep learning convolutional neural networks (CNNs) can automatically extract high-throughput image features and perform well in cancer detection, monitoring, and risk assessment [17]. However, training a CNN from scratch requires a large dataset, and in clinical practice it is difficult and expensive to collect large numbers of medical images for a specific task. Deep transfer learning has therefore become an effective way to address this problem [18]. CNNs pre-trained on natural images have been used to extract deep features from medical images and, combined with radiomics, to stage liver fibrosis [19] and to distinguish benign from malignant thyroid nodules [20] and breast tumors [21], with high diagnostic efficacy. In recent years, however, most studies predicting post-NAC pCR status have relied on MRI radiomics combined with deep transfer learning [22], whereas studies based on ultrasound images remain scarce [23]. Compared to MRI, ultrasound is more suitable for repeated examinations of patients during NAC.

In this study, we constructed a comprehensive model from preoperative ultrasound radiomics, deep transfer learning, and clinical features to predict whether breast cancer would achieve pCR after NAC.

Patients

This study was approved by the Ethics Committee of The affiliated Changzhou Second People’s Hospital of Nanjing Medical University (authorization number: [2020]KY154-01). Patients who completed NAC between January 2015 and December 2021 were selected for retrospective analysis. All eligible patients met the following inclusion criteria: (1) unilateral primary breast cancer with no distant metastasis, as confirmed by biopsy; (2) pCR status confirmed by postoperative pathological examination; and (3) no previous treatment other than NAC and no history of other malignant tumors. The exclusion criteria were as follows: (1) poor ultrasound image quality, or a tumor so large that its boundary could not be fully displayed; (2) incomplete clinical or imaging data; and (3) multiple lesions in the ipsilateral breast or lesions in both breasts. A total of 211 patients aged 28–85 years (mean 55.59 ± 11.77 years) were enrolled. All included patients had received 4–8 cycles of NAC before surgery. The NAC regimen was based on a taxane, an anthracycline, or a taxane combined with an anthracycline, and HER2-positive patients additionally received targeted therapy (an initial dose of 8 mg/kg, followed by a maintenance dose of 6 mg/kg).

Ultrasound examinations and image interpretation

All imaging was performed by physicians with more than 5 years of experience in breast ultrasound diagnosis using a Philips EPIQ5 or IU22 (Philips, the Netherlands) or an Esaote MyLab Twice (Esaote, Italy) system with a 7–12 MHz high-frequency linear array probe. Before data collection, the breast ultrasound mode was selected and the machine settings were standardized. During the examination, the patient remained in the supine position with her arms raised to fully expose the breast. All quadrants of both breasts were scanned, with a focus on the lesion area. Static images of the tumor along its longest axis were selected and stored in DICOM format for subsequent evaluation and analysis. Ultrasound examinations were completed in all patients within 2 weeks before NAC.

Images were analyzed according to the US BI-RADS [24] by two physicians with 3 and 10 years of experience in breast ultrasound. Disagreements between the two were resolved by a physician with 20 years of experience in breast ultrasound diagnosis. All physicians were blinded to the clinical data during image analysis. The following nine ultrasound features were evaluated for each mass: maximum diameter, location (outer upper quadrant, outer lower quadrant, inner upper quadrant, inner lower quadrant, or posterior nipple), shape (round, oval, or irregular), growth direction (parallel or vertical), echo (hypoechoic or heterogeneous), boundary (clear, blurred, angular, microlobulated, or spiculated; the last two appear as "differential lobe" and "burr" in Table 1), calcification (positive or negative), posterior echo (no attenuation or attenuation), and color Doppler flow (positive or negative).

Pathological evaluation and clinical data collection

The preoperative tumor biopsy tissues and postoperative specimens were made into slides. Two pathologists with more than 10 years of experience and no knowledge of the clinical data diagnosed the pathological tissues, and the samples with differences in evaluation were discussed repeatedly until a consensus was reached. The efficacy of NAC was evaluated according to the American Joint Committee on Cancer eighth edition cancer staging manual. pCR was defined as no residual invasive cancer tissue or only residual carcinoma in situ after pathological examination of the breast and lymph nodes, while non-pCR was defined as residual infiltrating cancer tissue after pathological examination of the breast and lymph nodes [25].

Clinical data included age, clinical stage, estrogen receptor (ER) status, progesterone receptor (PR) status, human epidermal growth factor receptor 2 (HER2) status, the proliferation antigen Ki67 index, ultrasound BI-RADS category, and two-dimensional ultrasound characteristics. The ER, PR, and HER2 status and the Ki67 index were assessed by immunohistochemistry (IHC). ER/PR was defined as positive when at least 1% of invasive tumor cells stained positive on IHC. HER2 was defined as positive for IHC 3+, or for IHC 2+ with amplification confirmed by fluorescence in situ hybridization (FISH); all other results were defined as negative. The Ki67 index was classified as high expression (≥ 20%) or low expression (< 20%).
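The receptor definitions above can be written as simple decision rules. The following is a minimal sketch only; the function and parameter names are hypothetical and do not come from the study.

```python
from typing import Optional


def er_pr_positive(percent_stained: float) -> bool:
    """ER/PR is positive when at least 1% of invasive tumor cells stain on IHC."""
    return percent_stained >= 1.0


def her2_positive(ihc_score: int, fish_amplified: Optional[bool] = None) -> bool:
    """HER2 is positive for IHC 3+, or for IHC 2+ with FISH amplification."""
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        return bool(fish_amplified)  # an unknown FISH result is treated as negative
    return False


def ki67_group(index_percent: float) -> str:
    """Ki67 index: high expression at 20% or above, otherwise low."""
    return "high" if index_percent >= 20.0 else "low"
```

Treating a missing FISH result as negative is an assumption of this sketch; the text does not say how such cases were handled.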

Image segmentation and feature extraction

DICOM images were imported into ITK-SNAP software, and regions of interest (ROIs) were manually drawn along the tumor contour on the ultrasound images by two experienced ultrasound physicians (readers 1 and 2) who were unaware of the pathological results. When the two readers disagreed about a segmented ROI, a senior physician experienced in superficial ultrasound diagnosis joined the discussion until a consensus was reached. The segmentation result is shown in Fig. 1. Intra-class correlation coefficients (ICC) were used to assess intra-observer and inter-observer consistency to ensure the repeatability of radiomic feature extraction. First, 20 ultrasound images were randomly selected from the training set to assess inter-observer repeatability. Second, 1 week later, the same images were segmented again by the ultrasound physicians to evaluate intra-observer repeatability. Only features with an ICC > 0.8 were selected for further analysis.
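The ICC filter can be illustrated as follows. The text does not say which ICC form was used, so this sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement) and uses synthetic values for one feature measured on 20 lesions by two readers.

```python
import numpy as np


def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1) for a (n_subjects, k_raters) array.

    Here each row would be one lesion and each column one segmentation
    (reader or session) of that lesion.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    ss_rows = k * np.sum((ratings.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((ratings.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((ratings - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Synthetic example: one feature, 20 lesions, two readers with small noise.
rng = np.random.default_rng(0)
true_values = rng.normal(size=20)
reader_1 = true_values + rng.normal(scale=0.05, size=20)
reader_2 = true_values + rng.normal(scale=0.05, size=20)

icc = icc_2_1(np.column_stack([reader_1, reader_2]))
keep_feature = icc > 0.8  # threshold stated in the text
```

In practice the same function would be applied to every feature column, keeping only those above the threshold.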

Manually defined radiomics features were extracted from each segmented image using the Pyradiomics (V3.0.1) package in a Python 3.6 environment (https://pyradiomics.readthedocs.io/en/latest/) [27]. The radiomics features comprised first-order statistical features, two-dimensional shape features, texture features, and wavelet features; the wavelet features were computed from a two-dimensional decomposition into four frequency sub-bands (HH, HL, LH, LL).

ResNet50 [28], pre-trained on the large-scale, well-annotated ImageNet dataset, was used as the base model for deep learning feature extraction. A global max pooling layer took the maximum value of each feature map, converting the feature maps into a vector of raw feature values. All deep learning programs were implemented in the TensorFlow framework on an Intel Core i7-11700F processor and an NVIDIA GeForce RTX 3060 GPU. The overall architecture is shown in Fig. 2.
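The global max pooling step can be sketched with NumPy. The array below is a random stand-in for the final convolutional output of ResNet50 for one image; the 7 × 7 × 2048 shape assumes a 224 × 224 input, which the text does not state. In Keras, the same reduction is available through the `pooling="max"` argument of `tf.keras.applications.ResNet50` with `include_top=False`.

```python
import numpy as np


def global_max_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Collapse (height, width, channels) feature maps to one value per channel."""
    return feature_maps.max(axis=(0, 1))


# Hypothetical stand-in for the last convolutional block's output:
# a 7 x 7 spatial grid with 2048 channels.
rng = np.random.default_rng(0)
feature_maps = rng.random((7, 7, 2048))

deep_features = global_max_pool(feature_maps)  # one deep feature per channel
```

Each image is thereby reduced to a 2048-element vector, which matches the number of deep learning features reported in the results.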

Data pre-processing

The original data were randomly split in a ratio of 7:3 using the model_selection module of the sklearn library. The training set comprised 147 patients (54 with pCR) and was used to train and tune the model; the validation set comprised 64 patients (24 with pCR) and was used to verify the stability of the model. Z-score normalization was applied to the overall data to bring features of different orders of magnitude onto the same scale, ensuring comparability between features and facilitating the subsequent screening algorithms.

$$Z=\frac{x-\bar{x}}{s}$$

where x represents the initial data, x̄ represents the mean, and s represents the standard deviation.
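A minimal sketch of the 7:3 split and Z-score normalization with scikit-learn, on synthetic features. Two choices here are assumptions rather than the documented procedure: the split is stratified by label, and the scaler is fitted on the training set only (a common precaution against information leaking from the validation set), whereas the text describes normalizing the overall data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(loc=5.0, scale=3.0, size=(211, 10))  # synthetic feature matrix
y = np.array([1] * 78 + [0] * 133)                  # 78 pCR and 133 NpCR patients

# test_size=0.3 on 211 samples yields 64 validation and 147 training samples.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# StandardScaler computes (x - mean) / s using the population standard deviation.
scaler = StandardScaler().fit(X_train)
Z_train = scaler.transform(X_train)
Z_val = scaler.transform(X_val)  # validation data reuse the training mean and s
```

The resulting set sizes match the 147 and 64 patients reported above; the per-class counts depend on the random seed and on whether stratification is used.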

Radiomics feature selection and radiomics score construction

Random forest-based recursive feature elimination and the least absolute shrinkage and selection operator (LASSO) with 10-fold cross-validation were used for dimensionality reduction in the training set; the feature set with the lowest cross-validated binomial deviance was selected as optimal. The radiomics score (Rad-Score) and deep learning score (DL-Score) were then constructed from the selected features.

$$\text{Rad-Score (DL-Score)} = \text{intercept} + \sum_{n=1}^{\text{features}} \left(\text{feature}_{n} \times \text{Coef}_{n}\right)$$

where the intercept is the intercept obtained after fitting the training set data using the LASSOCV model, feature is the feature filtered by LASSO, Coef is the characteristic regression coefficient, and features denote the number of filtered features.
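The two-stage selection and the score formula can be sketched with scikit-learn on synthetic data. Several details are assumptions: the forest size, the elimination step, and the use of `LassoCV`, which selects the penalty by mean squared error rather than by binomial deviance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV

# Synthetic stand-in for the standardized training features (147 patients).
X, y = make_classification(
    n_samples=147, n_features=60, n_informative=8, random_state=0
)
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 1: random forest-based recursive feature elimination down to 20 features.
rfe = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=20,
    step=5,
)
X_top20 = rfe.fit_transform(X, y)

# Step 2: LASSO with 10-fold cross-validation on the binary label.
lasso = LassoCV(cv=10, random_state=0).fit(X_top20, y)
selected = np.flatnonzero(lasso.coef_)  # features with non-zero coefficients

# Step 3: score = intercept + sum(feature_n * Coef_n) over the retained features.
score = lasso.intercept_ + X_top20[:, selected] @ lasso.coef_[selected]
```

Because features with zero coefficients contribute nothing, the score computed from the retained features equals the prediction of the fitted LASSO model.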

The machine learning classifier used in this study was logistic regression (LR) with L2 regularization to limit overfitting. Because the pCR and NpCR groups were imbalanced, we set class_weight = “balanced” to balance the weights of the two classes. The receiver operating characteristic (ROC) curve, area under the curve (AUC), accuracy, specificity, sensitivity, positive predictive value, and negative predictive value were calculated to evaluate model performance. The 95% confidence interval (CI) of the AUC was obtained from 1000 resamples, and the DeLong test was used to compare the AUCs of different models. Model calibration was checked with a calibration curve. Finally, decision curve analysis (DCA) was used to calculate the standardized net benefit across threshold probabilities from 0 to 1 to evaluate the clinical value of the model.
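A minimal sketch of the classifier and the resampled confidence interval, on synthetic data. The percentile bootstrap used here is an assumption; the text only states that 1000 resamples were drawn. The helper `bootstrap_auc_ci` is a hypothetical name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for four predictors.
X, y = make_classification(
    n_samples=211, n_features=4, n_informative=3, n_redundant=0,
    weights=[0.63, 0.37], random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The L2 penalty is scikit-learn's default for LogisticRegression.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, y_train)
prob_val = model.predict_proba(X_val)[:, 1]
auc = roc_auc_score(y_val, prob_val)


def bootstrap_auc_ci(y_true, y_prob, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_resamples:
        idx = rng.integers(0, n, size=n)  # sample patients with replacement
        if len(np.unique(y_true[idx])) < 2:
            continue  # AUC is undefined when a resample has only one class
        aucs.append(roc_auc_score(y_true[idx], y_prob[idx]))
    lower, upper = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper


ci_low, ci_high = bootstrap_auc_ci(y_val, prob_val)
```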

Statistical analysis

Statistical analysis was performed using SPSS 23.0 (IBM SPSS). Quantitative data conforming to a normal distribution are expressed as the mean ± SD, and the independent-samples t-test was used for comparison between groups. Data that did not conform to a normal distribution are expressed as the median (Q1, Q3), and the Mann–Whitney U test was used for comparison between groups. Qualitative data are expressed as the number of cases (%), and the Chi-square test was used for comparison between groups. P-values < 0.05 were considered to indicate a significant difference.
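As an illustration of the group comparison for qualitative data, the Chi-square test can be applied to the ER status counts of the training set reported in Table 1. The study used SPSS; SciPy is substituted here, and its default continuity correction for 2 × 2 tables may differ from the SPSS setting that was used.

```python
from scipy.stats import chi2_contingency

# Rows: ER positive, ER negative; columns: pCR, NpCR (training set, Table 1).
er_table = [[19, 71], [35, 22]]

chi2, p_value, dof, expected = chi2_contingency(er_table)
significant = p_value < 0.05  # significance threshold stated in the text
```

The resulting P-value is well below 0.001, in line with the value reported for ER status in the training set.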

Clinical characteristics

The clinical data of the patients are shown in Table 1. ER, PR, color Doppler flow, and posterior echo differed significantly between the pCR and NpCR groups in both the training set and the validation set; the longest diameter also differed in the validation set (P = 0.014), and no other characteristic differed significantly. Multivariate LR analysis (Table 2) showed that the ER and posterior echo were independent predictors of pCR status, and both were included in the clinical model. The AUC of the clinical model in the training set was 0.809 (95% CI: 0.764–0.855), while that in the validation set was 0.748 (95% CI: 0.696–0.796).

Table 1

Comparison of clinical characteristics between the pCR and NpCR groups in the training and validation sets. Values are mean ± SD or n (%).
Characteristic	Training set (n = 147)			Validation set (n = 64)
	pCR (n = 54)	NpCR (n = 93)	P	pCR (n = 24)	NpCR (n = 40)	P
Age(years)	54.70 ± 11.02	56.10 ± 12.21	0.491	54.08 ± 13.58	55.60 ± 12.06	0.644
Clinical stage			0.796			0.401
I	4(7.41)	10(10.8)		0(0)	2(5.0)
II	36(66.6)	59(63.4)		15(62.5)	20(50.0)
III	14(25.9)	24(25.8)		9(37.5)	18(45.0)
ER status			< 0.001			< 0.001
Positive	19(35.2)	71(76.3)		9(37.5)	34(85.0)
Negative	35(64.8)	22(23.7)		15(62.5)	6(15.0)
PR status			< 0.001			0.002
Positive	16(29.6)	56(60.2)		6(25.0)	26(65.0)
Negative	38(70.4)	37(39.8)		18(75.0)	14(35.0)
HER2 status			0.085			0.193
Positive	30(55.6)	38(40.9)		13(54.2)	15(37.5)
Negative	24(44.4)	55(59.1)		11(45.8)	25(62.5)
Ki67 status			0.132			0.118
high expression	52(96.3)	83(89.2)		23(95.8)	33(82.5)
low expression	2(3.7)	10(10.8)		1(4.2)	7(17.5)
BI-RADS			0.316			0.187
4a	1(1.9)	0(0)		1(4.2)	0(0)
4b	7(13.0)	10(10.8)		6(25.0)	4(10.0)
4c	19(35.1)	44(47.3)		8(33.3)	20(50.0)
5	27(50)	39(41.9)		9(37.5)	16(40.0)
Longest diameter	3.05 ± 1.39	3.44 ± 1.45	0.116	2.76 ± 0.69	3.35 ± 0.99	0.014
Location			0.593			0.650
Outer upper	31(57.3)	49(52.6)		17(70.8)	24(60.0)
Outer lower	7(13.0)	13(14.0)		1(4.2)	2(5.0)
Inner upper	5(9.3)	6(6.5)		3(12.5)	3(7.5)
Inner lower	7(13.0)	21(22.6)		2(8.3)	9(22.5)
posterior nipple	4(7.4)	4(4.3)		1(4.2)	2(5.0)
Echo type			0.285			0.179
Hypoechoic	51(94.4)	83(89.2)		23(95.8)	34(85.0)
heterogeneous	3(5.6)	10(10.8)		1(4.2)	6(15.0)
Boundary			0.161			0.291
differential lobe	13(24.1)	24(25.8)		5(20.8)	13(32.5)
burr	3(5.6)	16(17.2)		2(8.3)	2(5.0)
angular	21(38.9)	25(26.9)		11(45.8)	11(27.5)
blurred	13(24.1)	25(26.9)		4(16.7)	13(32.5)
clear	4(7.4)	3(3.2)		2(8.3)	1(2.5)
Shape			0.730			0.765
irregular	42(77.8)	71(76.3)		18(75.0)	33(82.5)
oval	11(20.4)	18(19.4)		5(20.8)	6(15.0)
round	1(1.8)	4(4.3)		1(4.2)	1(2.5)
Calcification			0.247			0.243
negative	18(33.3)	40(43.0)		10(41.7)	11(27.5)
positive	36(66.7)	53(57.0)		14(58.3)	29(72.5)
Color Doppler flow			0.011			0.021
negative	10(18.5)	5(5.4)		7(29.2)	3(7.5)
positive	44(81.5)	88(94.6)		17(70.8)	37(92.5)
Growth direction			0.208			0.452
Parallel	37(68.5)	54(58.1)		17(70.8)	24(60.0)
Vertical	17(31.5)	39(41.9)		7(29.2)	15(37.5)
Posterior echo			< 0.001			0.005
No attenuation	33(61.1)	13(14.0)		13(54.2)	8(20.0)
attenuation	21(38.9)	80(86.0)		11(45.8)	32(80.0)

Table 2

Multivariate logistic regression analysis of clinical data and two-dimensional ultrasound features
Features	β	SE	Wald	OR (95% CI)	P
Constant	–2.599	0.741	12.308	0.074 (0.017–0.318)	< 0.001
ER	1.239	0.457	7.339	3.451 (1.408–8.455)	0.007
PR	0.593	0.464	1.633	1.809 (0.729–4.493)	0.201
Color Doppler flow	0.997	0.688	2.102	2.711 (0.704–10.442)	0.147
Posterior echo	1.972	0.440	20.07	7.183 (3.032–17.021)	< 0.001

Radiomics feature selection and Rad-Score (DL-Score) development

A total of 469 radiomics features were extracted from the original two-dimensional ultrasound images, including 18 first-order statistical features, 14 two-dimensional shape features, 73 texture features, and 364 wavelet features. First, the 464 features with an ICC > 0.8 were retained. Then, these features were ranked by random forest-based recursive feature elimination, and the 20 features with the largest weights were selected. Finally, LASSO retained seven features, and the Rad-Score was calculated as follows:

Rad-Score = 0.3673 − 0.1126 × original_shape_Maximum2DDiameterSlice − 0.0599 × original_firstorder_RobustMeanAbsoluteDeviation − 9.3260 × wavelet-LH_gldm_GrayLevelNonUniformity − 0.0028 × wavelet-HL_glcm_Imc2 − 0.0226 × wavelet-HL_glszm_LargeAreaEmphasis − 0.0606 × wavelet-HH_gldm_DependenceNonUniformityNormalized − 0.0133 × wavelet-LL_glcm_Idn.

Deep learning features were screened with the same procedure as the radiomics features. A total of 2048 deep learning features were extracted by ResNet50, and nine features were retained for constructing the DL-Score after screening.

DL-Score = 0.3673 + 0.0088 × DL_282 − 0.0605 × DL_422 + 0.0281 × DL_438 + 0.0108 × DL_454 + 0.0216 × DL_567 − 0.0282 × DL_1291 − 0.0595 × DL_1341 − 0.0645 × DL_1828 + 0.1111 × DL_1848.
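The published coefficients can be applied to a patient's standardized features as in the sketch below, shown for the Rad-Score. The function name is hypothetical, and the fourth wavelet term is read as wavelet-HL_glcm_Imc2. The shared intercept of 0.3673 is numerically close to 54/147, the pCR proportion in the training set, which is consistent with (though not proof of) a linear LASSO fit on mean-centered features.

```python
RAD_INTERCEPT = 0.3673
RAD_COEFS = {
    "original_shape_Maximum2DDiameterSlice": -0.1126,
    "original_firstorder_RobustMeanAbsoluteDeviation": -0.0599,
    "wavelet-LH_gldm_GrayLevelNonUniformity": -9.3260,
    "wavelet-HL_glcm_Imc2": -0.0028,
    "wavelet-HL_glszm_LargeAreaEmphasis": -0.0226,
    "wavelet-HH_gldm_DependenceNonUniformityNormalized": -0.0606,
    "wavelet-LL_glcm_Idn": -0.0133,
}


def rad_score(z_features: dict) -> float:
    """Intercept plus the coefficient-weighted sum of standardized features."""
    return RAD_INTERCEPT + sum(
        coef * z_features[name] for name, coef in RAD_COEFS.items()
    )


# A patient whose standardized features are all zero scores the intercept.
average_patient = {name: 0.0 for name in RAD_COEFS}
baseline = rad_score(average_patient)
training_pcr_rate = 54 / 147
```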

DLR + C construction and performance evaluation

The comprehensive model DLR + C was constructed by combining the independent clinical predictors (ER and posterior echo) with the Rad-Score and DL-Score. A comparison model, DLR, based only on the Rad-Score and DL-Score was also constructed. Table 3 shows the prediction performance of each model based on LR. In the training set, the AUC of DLR + C was 0.906 (95% CI: 0.871–0.935), and the accuracy, sensitivity, and specificity were 0.823, 0.833, and 0.817, respectively. In the validation set, the AUC was 0.849 (95% CI: 0.799–0.887), and the accuracy, sensitivity, and specificity were 0.797, 0.750, and 0.825, respectively; the AUC, accuracy, and specificity were the highest among all models. The DLR model combining the DL-Score and Rad-Score also achieved a higher AUC than the Rad-Score, DL-Score, and clinical models. Figure 3 shows the ROC curves of each model in the training and validation sets. The DeLong test (Table 4) showed that the AUC of DLR + C differed significantly from those of the clinical and Rad-Score models in the training set (both P < 0.001), but only from that of the clinical model in the validation set (P = 0.002). Figure 4 shows the calibration curve of the DLR + C model, and the Hosmer-Lemeshow goodness-of-fit test (all P > 0.05) indicated good calibration in both the training set and the validation set. Figure 5 shows the decision curves, which demonstrated that the DLR + C model has higher clinical utility than the DLR, DL-Score, Rad-Score, and clinical models and than the treat-all and treat-none strategies.
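Decision curve analysis compares models by their net benefit at each threshold probability. The sketch below uses the standard definition, net benefit = TP/n − FP/n × pt/(1 − pt); the function names are hypothetical. The methods refer to a standardized net benefit, which is commonly obtained by dividing by the outcome prevalence, and that step is omitted here.

```python
import numpy as np


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    y_true = np.asarray(y_true)
    treat = np.asarray(y_prob) >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the treat-all strategy; treat-none is always 0."""
    prevalence = np.mean(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy example with four patients and a threshold probability of 0.5.
labels = [1, 1, 0, 0]
probs = [0.9, 0.8, 0.3, 0.1]
nb_model = net_benefit(labels, probs, 0.5)
nb_all = net_benefit_treat_all(labels, 0.5)
```

A decision curve plots these quantities over a grid of thresholds strictly between 0 and 1; a model is clinically useful where its curve lies above both reference strategies.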

Table 3

Prediction performance of each model based on LR
	AUC (95%CI)	Accuracy	Sensitivity	Specificity	Positive predictive value	Negative predictive value
Training set
Clinical	0.809 (0.764–0.855)	0.735	0.833	0.677	0.600	0.875
Rad-Score	0.757 (0.698–0.811)	0.680	0.778	0.624	0.545	0.829
DL-Score	0.868 (0.823–0.906)	0.782	0.815	0.763	0.667	0.877
DLR	0.878 (0.843–0.910)	0.796	0.815	0.785	0.688	0.880
DLR + C	0.906 (0.871–0.935)	0.823	0.833	0.817	0.726	0.894
Validation set
Clinical	0.748 (0.696–0.796)	0.688	0.750	0.650	0.562	0.812
Rad-Score	0.749 (0.692–0.800)	0.641	0.792	0.550	0.514	0.815
DL-Score	0.790 (0.744–0.835)	0.703	0.750	0.675	0.581	0.818
DLR	0.811 (0.763–0.852)	0.734	0.833	0.675	0.606	0.871
DLR + C	0.849 (0.799–0.887)	0.797	0.750	0.825	0.720	0.846
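The DLR + C rows of Table 3 can be cross-checked against confusion-matrix counts. The counts below are not reported in the text; they are inferred from the group sizes in Table 1 and the rounded metrics, so they are an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Inferred counts: training set (54 pCR, 93 NpCR), validation set (24 pCR, 40 NpCR).
training = diagnostic_metrics(tp=45, fn=9, tn=76, fp=17)
validation = diagnostic_metrics(tp=18, fn=6, tn=33, fp=7)
```

Rounded to three decimals, these counts reproduce the accuracy, sensitivity, specificity, and predictive values listed for DLR + C in both sets.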

Table 4

DeLong test results between the DLR + C and each model
	Training set		Validation set
	Z	P	Z	P
Clinical	–4.702	< 0.001	–3.105	0.002
Rad-Score	–3.538	< 0.001	–1.436	0.151
DL-Score	–1.334	0.182	–0.917	0.359
DLR	–1.055	0.291	–0.637	0.524
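The P-values in Table 4 follow from the Z statistics through the two-sided normal tail probability, P = 2 × (1 − Φ(|Z|)). A short check for the validation set, using only the standard library:

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Z statistics for DLR + C versus each model in the validation set (Table 4).
validation_z = {
    "Clinical": -3.105,
    "Rad-Score": -1.436,
    "DL-Score": -0.917,
    "DLR": -0.637,
}
validation_p = {name: two_sided_p(z) for name, z in validation_z.items()}
```

The computed values agree with the reported P-values to three decimals, confirming that only the comparison with the clinical model is significant in the validation set.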

In this study, we constructed a comprehensive prediction model, DLR + C, which integrates clinical data, two-dimensional ultrasound features, and the radiomics and deep transfer learning features of pre-treatment ultrasound images. In the training set, the AUC of the DLR + C model was significantly higher than those of the clinical and Rad-Score models. The DLR + C model can replace the traditional radiomics + clinical feature prediction model and predict the pCR status of patients with breast cancer after NAC more accurately and non-invasively before surgery, which is expected to be helpful for clinical decision-making.

Radiomics, as a new artificial intelligence technology in medical imaging, can extract high-dimensional image features from medical images such as ultrasound, MRI, and CT to describe breast tumors more comprehensively. Previous studies have reported the value of radiomics in breast cancer prognosis prediction [29–31], but most have focused on CT and MRI. Compared to MRI and CT, ultrasound has greater clinical and economic benefits in the preoperative evaluation of breast cancer pCR because of its lower cost, simpler procedure, and easier access to images. Yang et al. [32] compared ultrasound radiomics features before treatment and in the early stage of treatment and found that the change between the two time points was independently related to the NAC response of breast cancer, demonstrating that ultrasound radiomics features can be used before surgery to predict the response of patients with breast cancer to NAC. However, previous studies have focused on a single radiomics technology, which limits the prediction performance of the model.

We included 211 patients in this study, a larger sample than in previous studies, and included patients with different pathological types of breast cancer to avoid the influence of tumor pathological type on model performance. Additionally, we used deep transfer learning to extract deep learning features from the ultrasound images. The results showed that the extracted deep learning features outperformed the radiomics features in predicting the efficacy of NAC in patients with breast cancer. These findings suggest that deep learning features may be a valuable new indicator for predicting pCR and may be complementary to radiomics features, with the potential to improve the diagnostic performance of the model. Furthermore, the DeLong test showed that the DLR + C model was not statistically different from the DLR and DL-Score models. This may be because deep learning features carry greater weight than radiomics features, both on their own and within the fusion model DLR + C. We also analyzed the relevant clinical prognostic factors. Multivariate LR analysis showed that the ER and posterior echo were independent predictors of pCR status, and both were included in the clinical-radiomics-deep learning comprehensive model DLR + C. In the validation set, the AUC and accuracy of DLR + C reached 0.849 and 0.797, respectively, an improvement over previous studies based on radiomics and clinical characteristics. Moreover, whereas some previous studies needed ultrasound image features from both before and during NAC to accurately predict the response, our study used only ultrasound images acquired before NAC, which could bring forward the time point of clinical decision-making.

This study has some limitations. First, this study is a single-center retrospective study, and multi-center independent datasets are required to further verify the model. Second, regarding the ROI of each lesion, although we tried to ensure the repeatability of radiomic feature extraction in the study, human factors may still cause deviations in the feature extraction. Furthermore, our study is based on two-dimensional ultrasonic image analysis, and multi-modal ultrasound analysis will be added in subsequent studies to further improve the accuracy of the prediction model.

In summary, using two-dimensional ultrasound images acquired before NAC, we established a prediction model of post-NAC pCR status for patients with invasive ductal carcinoma of the breast by combining radiomics and deep transfer learning with two-dimensional ultrasound and clinical features. The model achieved good results in the validation set and can predict, before treatment, whether patients receiving NAC will achieve pCR. It can provide an effective reference for clinical pCR diagnosis and promote the implementation of personalized programs for tumor management.

Acknowledgments

We thank Professor Xiaoqin Li for her support of this research. This study received funding from the Special Fund of Science and Technology Program of Jiangsu Province (Key R&D Program for Social Development) (BE2022720). Institutional Review Board approval was obtained from the Changzhou No. 2 People’s Hospital affiliated with Nanjing Medical University. We thank LetPub (www.letpub.com) for its linguistic assistance during the preparation of this manuscript.

Author contributions

ZW: design, acquisition, interpretation of data; TZ: acquisition, analysis of data; HZ: interpretation of data; CZ: acquisition of data; TD: analysis; XL: revised manuscript, guided the experimental process; LX: revised manuscript. All authors read and approved the final manuscript.

Funding

This study received funding from Special Fund of Science and Technology Program of Jiangsu Province (Key R&D Program for Social Development)(BE2022720).

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available to protect patient privacy but are available from the corresponding author upon reasonable request.

Ethics approval and consent to participate

All procedures performed in this study involving human participants were in accordance with the Declaration of Helsinki (as revised in 2013). This study was approved by the Ethics Committee of The affiliated Changzhou Second People’s Hospital of Nanjing Medical University (authorization number: [2020]KY154-01), and all study patients provided written informed consent. All methods were performed in accordance with the relevant guidelines and regulations of the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Consent for publication

Not applicable.

Competing interests

The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2022. CA Cancer J Clin. 2022 Jan;72(1):7–33. doi: 10.3322/caac.21708.
Korde LA, Somerfield MR, Carey LA, Crews JR, Denduluri N, Hwang ES, Khan SA, Loibl S, Morris EA, Perez A, Regan MM, Spears PA, Sudheendra PK, Symmans WF, Yung RL, Harvey BE, Hershman DL. Neoadjuvant Chemotherapy, Endocrine Therapy, and Targeted Therapy for Breast Cancer: ASCO Guideline. J Clin Oncol. 2021 May 1;39(13):1485–1505. doi: 10.1200/JCO.20.03399.
Kerr AJ, Dodwell D, McGale P, Holt F, Duane F, Mannu G, Darby SC, Taylor CW. Adjuvant and neoadjuvant breast cancer treatments: A systematic review of their effects on mortality. Cancer Treat Rev. 2022 Apr;105:102375. doi: 10.1016/j.ctrv.2022.102375.
Asaoka M, Narui K, Suganuma N, Chishima T, Yamada A, Sugae S, Kawai S, Uenaka N, Teraoka S, Miyahara K, Kawate T, Sato E, Nagao T, Matsubara Y, Gandhi S, Takabe K, Ishikawa T. Clinical and pathological predictors of recurrence in breast cancer patients achieving pathological complete response to neoadjuvant chemotherapy. Eur J Surg Oncol. 2019 Dec;45(12):2289–2294. doi: 10.1016/j.ejso.2019.08.001.
Wang Z, Lin F, Ma H, Shi Y, Dong J, Yang P, Zhang K, Guo N, Zhang R, Cui J, Duan S, Mao N, Xie H. Contrast-Enhanced Spectral Mammography-Based Radiomics Nomogram for the Prediction of Neoadjuvant Chemotherapy-Insensitive Breast Cancers. Front Oncol. 2021 Feb 22;11:605230. doi: 10.3389/fonc.2021.605230.
Gao LY, Gu Y, Xu W, Tian JW, Yin LX, Ran HT, Ren WD, Mu YM, Zhang JY, Chang C, Yuan JJ, Kang CS, Deng YB, Wang H, Xie XY, Luo BM, Guo SL, Zhou Q, Xue ES, Zhan WW, Jiao T, Zhou Q, Li J, Zhou P, Huang PT, Xue HY, Zhang CQ, Chen M, Jing XX, Gu Y, Guo JF, Ding HY, Xu JF, Chen W, Liu L, Zhang YH, Wang HQ, Mu ZP, Li JC, Wang HY, Jiang YX. Can Combined Screening of Ultrasound and Elastography Improve Breast Cancer Identification Compared with MRI in Women with Dense Breasts-a Multicenter Prospective Study. J Cancer. 2020 Apr 6;11(13):3903–3909. doi: 10.7150/jca.43326.
Wang K, Zou Z, Shen H, Huang G, Yang S, Calcification. Posterior Acoustic, and Blood Flow: Ultrasonic Characteristics of Triple-Negative Breast Cancer.J Healthc Eng. 2022 Sep26;2022:9336185. doi: 10.1155/2022/9336185.
Croshaw R, Shapiro-Wright H, Svensson E, Erb K, Julian T. Accuracy of clinical examination, digital mammogram, ultrasound, and MRI in determining postneoadjuvant pathologic tumor response in operable breast cancer patients. Ann Surg Oncol. 2011 Oct;18(11):3160–3. 10.1245/s10434-011-1919-5.
Evans A, Sim YT, Whelehan P, Savaridas S, Jordan L, Thompson A. Are baseline mammographic and ultrasound features associated with metastasis free survival in women receiving neoadjuvant chemotherapy for invasive breast cancer? Eur J Radiol. 2021 Aug;141:109790. 10.1016/j.ejrad.2021.109790.
Baysal H, Serdaroglu AY, Ozemir IA, Baysal B, Gungor S, Erol CI, Ozsoy MS, Ekinci O, Alimoglu O. Comparison of Magnetic Resonance Imaging With Positron Emission Tomography/Computed Tomography in the Evaluation of Response to Neoadjuvant Therapy of Breast Cancer.J Surg Res. 2022Oct;278:223–232. doi: 10.1016/j.jss.2022.04.063.
Kim JH, Park VY, Shin HJ, Kim MJ, Yoon JH. Ultrafast dynamic contrast-enhanced breast MRI: association with pathologic complete response in neoadjuvant treatment of breast cancer.Eur Radiol. 2022 Jul;32(7):4823–4833. doi: 10.1007/s00330-021-08530-4.
Le-Petross HT, Lim B. Role of MR Imaging in Neoadjuvant Therapy Monitoring. Magn Reson Imaging Clin N Am. 2018 May;26(2):207–220. doi: 10.1016/j.mric.2017.12.011. Epub 2018 Mar 2. PMID: 29622126.
Romeo V, Accardo G, Perillo T, Basso L, Garbino N, Nicolai E, Maurea S, Salvatore M. Assessment and Prediction of Response to Neoadjuvant Chemotherapy in Breast Cancer: A Comparison of Imaging Modalities and Future Perspectives. Cancers (Basel). 2021 Jul 14;13(14):3521. doi: 10.3390/cancers13143521. PMID: 34298733; PMCID: PMC8303777.
Guerini-Rocco E, Botti G, Foschini MP, Marchiò C, Mastropasqua MG, Perrone G, Roz E, Santinelli A, Sassi I, Galimberti V, Gianni L, Viale G. Role and evaluation of pathologic response in early breast cancer specimens after neoadjuvant therapy: consensus statement. Tumori. 2022 Jun;108(3):196–203. 10.1177/03008916211062642.
Vriens BE, de Vries B, Lobbes MB, van Gastel SM, van den Berkmortel FW, Smilde TJ, van Warmerdam LJ, de Boer M, van Spronsen DJ, Smidt ML, Peer PG, Aarts MJ, Tjan-Heijnen VC, INTENS Study Group. ;. Ultrasound is at least as good as magnetic resonance imaging in predicting tumour size post-neoadjuvant chemotherapy in breast cancer.Eur J Cancer. 2016 Jan;52:67–76. doi: 10.1016/j.ejca.2015.10.010. Epub 2015 Nov 30. PMID: 26650831.
Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J, Sanduleanu S, Larue RTHM, Even AJG, Jochems A, van Wijk Y, Woodruff H, van Soest J, Lustberg T, Roelofs E, van Elmpt W, Dekker A, Mottaghy FM, Wildberger JE, Walsh S. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017 Dec;14(12):749–62. 10.1038/nrclinonc.2017.141.
Moon WK, Lee YW, Ke HH, Lee SH, Huang CS, Chang RF. Computer-aided diagnosis of breast ultrasound images using ensemble learning from convolutional neural networks. Comput Methods Programs Biomed. 2020 Jul;190:105361. 10.1016/j.cmpb.2020.105361.
Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, Yao J, Mollura D, Summers RM. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning. IEEE Trans Med Imaging. 2016 May;35(5):1285–98. 10.1109/TMI.2016.2528162.
Xue LY, Jiang ZY, Fu TT, Wang QM, Zhu YL, Dai M, Wang WP, Yu JH, Ding H. Transfer learning radiomics based on multimodal ultrasound imaging for staging liver fibrosis.Eur Radiol. 2020May;30(5):2973–2983. doi: 10.1007/s00330-019-06595-w.
Zhou H, Wang K, Tian J. Online Transfer Learning for Differential Diagnosis of Benign and Malignant Thyroid Nodules With Ultrasound Images. IEEE Trans Biomed Eng. 2020 Oct;67(10):2773–80. 10.1109/TBME.2020.2971065.
Ayana G, Dese K, Choe SW. Transfer Learning in Breast Cancer Diagnoses via Ultrasound Imaging. Cancers (Basel). 2021 Feb 10;13(4):738. doi: 10.3390/cancers13040738. PMID: 33578891; PMCID: PMC7916666.
Bitencourt AGV, Gibbs P, Rossi Saccarelli C, Daimiel I, Lo Gullo R, Fox MJ, Thakur S, Pinker K, Morris EA, Morrow M, Jochelson MS. MRI-based machine learning radiomics can predict HER2 expression level and pathologic response after neoadjuvant therapy in HER2 overexpressing breast cancer. EBioMedicine. 2020 Nov;61:103042. doi: 10.1016/j.ebiom.2020.103042. Epub 2020 Oct 8. PMID: 33039708; PMCID: PMC7648120.
Gu J, Tong T, He C, Xu M, Yang X, Tian J, Jiang T, Wang K. Deep learning radiomics of ultrasonography can predict response to neoadjuvant chemotherapy in breast cancer at an early stage of treatment: a prospective study.Eur Radiol. 2022Mar;32(3):2099–2109. doi: 10.1007/s00330-021-08293-y. Epub 2021 Oct 15. PMID: 34654965.
Magny SJ, Shikhman R, Keppke AL. Breast Imaging Reporting and Data System. 2022 Aug 29. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2022 Jan–. PMID: 29083600.
Giuliano AE, Connolly JL, Edge SB, Mittendorf EA, Rugo HS, Solin LJ, Weaver DL, Winchester DJ, Hortobagyi GN. Breast Cancer-Major changes in the American Joint Committee on Cancer eighth edition cancer staging manual.CA Cancer J Clin. 2017 Jul8;67(4):290–303. doi: 10.3322/caac.21393.
Li F, Yang Y, Wei Y, He P, Chen J, Zheng Z, Bu H. Deep learning-based predictive biomarker of pathological complete response to neoadjuvant chemotherapy from histological images in breast cancer.J Transl Med. 2021 Aug16;19(1):348. doi: 10.1186/s12967-021-03020-z. PMID: 34399795; PMCID: PMC8365907.
van Griethuysen JJM, Fedorov A, Parmar C, Hosny A, Aucoin N, Narayan V, Beets-Tan RGH, Fillion-Robin JC, Pieper S, Aerts HJWL. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res. 2017 Nov 1;77(21):e104-e107. doi: 10.1158/0008-5472.CAN-17-0339. PMID: 29092951; PMCID: PMC5672828.
Dai M, Liu Y, Hu Y, Li G, Zhang J, Xiao Z, Lv F. Combining multiparametric MRI features-based transfer learning and clinical parameters: application of machine learning for the differentiation of uterine sarcomas from atypical leiomyomas. Eur Radiol. 2022 May;18. 10.1007/s00330-022-08783-7.
Kuramoto Y, Wada N, Uchiyama Y. Prediction of pathological complete response using radiomics on MRI in patients with breast cancer undergoing neoadjuvant pharmacotherapy[J]. Int J Comput Assist Radiol Surg. 2022;17(4):619–25. 10.1007/s11548-022-02560-z.
Herrero Vicent C, Tudela X, Moreno Ruiz P, Pedralva V, Jiménez Pastor A, Ahicart D, Rubio Novella S, Meneu I, Montes Albuixech Á, Santamaria M, Fonfria M, Fuster-Matanzo A, Olmos Antón S. Martínez de Dueñas E. Machine Learning Models and Multiparametric Magnetic Resonance Imaging for the Prediction of Pathologic Response to Neoadjuvant Chemotherapy in Breast Cancer. Cancers (Basel). 2022 Jul 19;14(14):3508. doi: 10.3390/cancers14143508..
Qi TH, Hian OH, Kumaran AM, Tan TJ, Cong TRY, Su-Xin GL, Lim EH, Ng R, Yeo MCR, Tching FLLW, Zewen Z, Hui CYS, Xin WR, Ooi SKG, Leong LCH, Tan SM, Preetha M, Sim Y, Tan VKM, Yeong J, Yong WF, Cai Y, Nei WL, JBCR. Ai3. Multi-center evaluation of artificial intelligent imaging and clinical models for predicting neoadjuvant chemotherapy response in breast cancer. Breast Cancer Res Treat. 2022 May;193(1):121–38. 10.1007/s10549-022-06521-7.
Yang M, Liu H, Dai Q, Yao L, Zhang S, Wang Z, Li J, Duan Q. Treatment Response Prediction Using Ultrasound-Based Pre-, Post-Early, and Delta Radiomics in Neoadjuvant Chemotherapy in Breast Cancer.Front Oncol. 2022 Feb7;12:748008. doi: 10.3389/fonc.2022.748008.
