Participant and Imaging Characteristics
In total, 186 lesions (115 malignant and 71 benign) from 179 female participants (mean age, 48 ± 11 years) at 17 tertiary centers across China were enrolled after exclusion of 177 lesions (Figure 1), allowing for the inclusion of a diverse patient population. The mean age ± SD of the overall cohort was 49 ± 11 years, and all 179 participants were women (Table 1). The overall pathologic distribution and the distribution in each cohort are presented in Supplementary Table 5. The frequencies of the PFB-CEUS and MP-MRI characteristics in the two cohorts are described in Supplementary Table 6.
Model Development and Performance
The univariate and multivariable logistic regression analyses for the PFB-CEUS, MP-MRI, and hybrid models are shown in Supplementary Tables 7, 8, and 9, respectively.
In the external validation cohort, the hybrid model showed a higher AUC (0.92 [95% CI: 0.77, 0.98]) than both the MP-MRI and PFB-CEUS models. The PFB-CEUS model had an AUC of 0.89 (95% CI: 0.74, 0.97), which was higher than that of PFB-CEUS BI-RADS (AUC 0.80 [95% CI: 0.63, 0.92], P = .43), the Luo model (AUC 0.74 [95% CI: 0.57, 0.87], P = .01) (24), the Chen model (AUC 0.69 [95% CI: 0.51, 0.83], P = .02) (25), and the Yukio model (AUC 0.70 [95% CI: 0.53, 0.85], P = .01) (13). In addition, the MP-MRI model had an AUC of 0.89 (95% CI: 0.73, 0.97), which was higher than that of ACR BI-RADS (AUC 0.73 [95% CI: 0.55, 0.87], P = .15) (Supplementary Table 10, Figure 2A).
The PFB-CEUS, MP-MRI, and hybrid models were well calibrated, with no significant lack of fit in the Hosmer-Lemeshow test (all P > .05). The calibration plots are shown in Figure 2B, and the decision curves are shown in Supplementary Figure 2. In addition, dynamic nomograms of the PFB-CEUS, MP-MRI, and hybrid models are available at https://ceus.shinyapps.io/PFB-CEUS/, https://ceus.shinyapps.io/MP-MRI/, and https://ceus.shinyapps.io/Hybrid/, respectively. Figure 3A shows a matched example in which both the PFB-CEUS and MP-MRI nomograms indicated a high probability of breast cancer. Figure 3B shows a matched example in which the PFB-CEUS nomogram indicated a high probability of malignancy but the MP-MRI nomogram indicated a low probability; the lesion was diagnosed as invasive carcinoma at pathology. Figure 3C shows a matched example in which the PFB-CEUS and MP-MRI nomograms indicated a high probability of malignancy but the hybrid nomogram indicated a low probability; the lesion was diagnosed as hyperplasia at pathology.
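For readers less familiar with the Hosmer-Lemeshow test mentioned above, the following is a minimal sketch of the statistic, not the analysis code used in this study. It assumes equal-sized risk groups ordered by predicted probability and g - 2 degrees of freedom; the function names and the choice of grouping are illustrative assumptions. A P value above .05 indicates no evidence of miscalibration.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (even df only).

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no external library is needed.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only supports positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return (statistic, P value) for predicted probabilities vs. 0/1 outcomes."""
    # Sort by predicted probability only; ties keep their input order.
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / size
        if 0.0 < mean_p < 1.0:
            stat += (observed - expected) ** 2 / (expected * (1.0 - mean_p))
    return stat, chi2_sf_even_df(stat, groups - 2)
```

With the conventional 10 groups, the reference distribution has 8 degrees of freedom, which is why an even-df helper is sufficient for this sketch.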
Comparison among the three models
PFB-CEUS vs. MP-MRI model
The PFB-CEUS and MP-MRI models showed comparable discrimination for diagnosing breast cancer in both the development cohort (AUC 0.90 [95% CI: 0.84, 0.94] vs. AUC 0.90 [95% CI: 0.85, 0.95]; P = .80) and the validation cohort (AUC 0.89 [95% CI: 0.74, 0.97] vs. AUC 0.89 [95% CI: 0.73, 0.97]; P = .85) (Supplementary Table 10).
PFB-CEUS vs. Hybrid model
The hybrid model had a higher AUC for diagnosing breast cancer than the PFB-CEUS model in both the development cohort (AUC 0.95 [95% CI: 0.90, 0.98] vs. AUC 0.90 [95% CI: 0.84, 0.94]; P = .01) and the validation cohort (AUC 0.92 [95% CI: 0.77, 0.98] vs. AUC 0.89 [95% CI: 0.74, 0.97]; P = .29) (Supplementary Table 10).
MP-MRI vs. Hybrid model
The hybrid model had a higher AUC for diagnosing breast cancer than the MP-MRI model in both the development cohort (AUC 0.95 [95% CI: 0.90, 0.98] vs. AUC 0.90 [95% CI: 0.85, 0.95]; P = .078) and the validation cohort (AUC 0.92 [95% CI: 0.77, 0.98] vs. AUC 0.89 [95% CI: 0.73, 0.97]; P = .401) (Supplementary Table 10).
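The AUC values compared above summarize how well each model ranks malignant lesions above benign ones. As a minimal sketch (not the study's analysis code; the function name, label encoding, and toy scores are hypothetical), the AUC can be computed as the probability that a randomly chosen malignant lesion receives a higher predicted probability than a randomly chosen benign lesion, with ties counted as one half:

```python
def auc_mann_whitney(scores, labels):
    """AUC as P(malignant score > benign score) + 0.5 * P(tie).

    labels: 1 = malignant, 0 = benign (assumed encoding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign lesion")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy illustration only; these are not study data.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_mann_whitney(toy_scores, toy_labels)
```

In the toy data, 8 of the 9 malignant-benign pairs are ranked correctly. The confidence intervals and P values for paired AUC comparisons reported in this section require an additional variance estimate that this sketch does not cover.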
Model performance in sub-populations
Among the subgroups dichotomized by age and menstrual status, the hybrid model demonstrated a higher AUC than the PFB-CEUS model in patients younger than 49 years (AUC 0.96 [95% CI: 0.89, 0.99] vs. AUC 0.86 [95% CI: 0.76, 0.93]) and in premenopausal patients (AUC 0.94 [95% CI: 0.87, 0.98] vs. AUC 0.86 [95% CI: 0.77, 0.92]). No significant difference in AUC was noted for the other between-model comparisons (Supplementary Table 11). The diagnostic results for high-risk lesions (lesions for which surgical consultation would be appropriate), including intraductal papilloma, hyperplasia, and phyllodes tumor, are presented in Supplementary Table 12.
Model performance for diagnosing breast cancer compared with radiologists
BI-RADS categories 4A, 4B, and 4C were used as the cut points for comparison (the 4A+, 4B+, and 4C+ modes).
PFB-CEUS
When compared with the on-site radiologists, the PFB-CEUS model achieved higher sensitivity at the BI-RADS 4B+ and 4C+ modes and lower specificity at all three modes, as did the hybrid model. When compared with the three senior reviewers, the PFB-CEUS model achieved higher sensitivity at the BI-RADS 4B+ and 4C+ modes and lower specificity at all three modes. In contrast, the hybrid model achieved higher sensitivity at all three BI-RADS modes and higher specificity at the BI-RADS 4C+ mode (Table 2).
MP-MRI
When compared with the on-site radiologists, the MP-MRI model achieved higher sensitivity and lower specificity at all three modes, as did the hybrid model. When compared with the three senior reviewers, the MP-MRI model achieved higher sensitivity at all three modes and higher specificity at the BI-RADS 4C+ mode. The hybrid model achieved higher sensitivity at all three BI-RADS modes and higher specificity at the BI-RADS 4B+ and 4C+ modes (Table 2).
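To make the cut-point comparison above concrete, the following sketch shows how a reader's BI-RADS assessments could be dichotomized at a given category and scored against pathology. It is illustrative only: the ordinal list of categories, the function name, and the toy readings are assumptions and are not taken from the study data.

```python
# Assumed ordinal scale; a lesion is "positive" at cut X+ if its
# assessed category is X or higher.
BI_RADS_ORDER = ["3", "4A", "4B", "4C", "5"]


def sens_spec_at_cut(categories, malignant, cut):
    """Return (sensitivity, specificity) when `cut` or higher counts as positive."""
    threshold = BI_RADS_ORDER.index(cut)
    tp = fn = tn = fp = 0
    for cat, is_malignant in zip(categories, malignant):
        positive = BI_RADS_ORDER.index(cat) >= threshold
        if is_malignant and positive:
            tp += 1
        elif is_malignant:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy readings only; not study data.
toy_categories = ["3", "4A", "4B", "4C", "5", "4A"]
toy_malignant = [0, 0, 0, 1, 1, 1]
for cut in ("4A", "4B", "4C"):
    print(cut + "+", sens_spec_at_cut(toy_categories, toy_malignant, cut))
```

Raising the cut point from 4A+ to 4C+ trades sensitivity for specificity, which is why the models are compared with the readers at each mode separately.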
Model Performance with False-positive and False-negative Correction Rate
PFB-CEUS
The false-positive correction rates of the PFB-CEUS and hybrid models were 80.6% and 90.3%, respectively, for the on-site results and 82.2% and 88.9%, respectively, for the reviewers' results at the BI-RADS 4A+ mode. The false-negative correction rates of the PFB-CEUS and hybrid models were 66.7% and 66.7%, respectively, for the on-site results and 83.3% and 83.3%, respectively, for the reviewers' results at the BI-RADS 4A+ mode (Supplementary Table 13, Figure 4).
MP-MRI
The false-positive correction rates of the MP-MRI and hybrid models were 77.7% and 86.1%, respectively, for the on-site results and 83.0% and 90.5%, respectively, for the reviewers' results at the BI-RADS 4A+ mode. The false-negative correction rates of the MP-MRI and hybrid models were 50.0% and 0.0%, respectively, for the on-site results and 57.1% and 42.8%, respectively, for the reviewers' results at the BI-RADS 4A+ mode (Supplementary Table 13, Figure 4).
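The correction rates above can be illustrated with a short sketch. It assumes, since this section does not spell out the definition, that the false-positive correction rate is the share of a reader's false-positive lesions that the model classifies as benign, and that the false-negative correction rate is the share of a reader's false-negative lesions that the model classifies as malignant. The function name and toy values are hypothetical.

```python
def correction_rates(reader_positive, model_positive, malignant):
    """Return (false-positive correction rate, false-negative correction rate).

    Assumed definitions (not confirmed by the text):
      FP correction = reader false positives the model calls negative
                      / all reader false positives
      FN correction = reader false negatives the model calls positive
                      / all reader false negatives
    A rate is None when the reader made no error of that type.
    """
    fp_total = fp_fixed = fn_total = fn_fixed = 0
    for r, m, y in zip(reader_positive, model_positive, malignant):
        if r and not y:          # reader false positive
            fp_total += 1
            if not m:
                fp_fixed += 1
        elif y and not r:        # reader false negative
            fn_total += 1
            if m:
                fn_fixed += 1
    fp_rate = fp_fixed / fp_total if fp_total else None
    fn_rate = fn_fixed / fn_total if fn_total else None
    return fp_rate, fn_rate


# Toy calls only; not study data.
toy_reader = [1, 1, 1, 0, 0, 1]
toy_model = [0, 0, 1, 1, 0, 1]
toy_truth = [0, 0, 0, 1, 1, 1]
toy_fp_rate, toy_fn_rate = correction_rates(toy_reader, toy_model, toy_truth)
```

Because the denominators are the reader's own errors, a rate based on very few false negatives can swing widely from a single lesion, which is worth keeping in mind when reading values such as 0.0% or 50.0%.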