This was a cross-sectional observational study. Ethics approval was obtained from the Institutional Review Board of Peking Union Medical College Hospital. Written informed consent was obtained from all adult patients. For patients under 18 years of age, written informed consent was signed by the guardians who accompanied them to the US examination.
2.1 Patients and imaging
A total of 195 focal breast lesions were enrolled consecutively in this study. The patients were aged 15 to 82 years (mean, 45.7 years; median, 45.0 years).
The inclusion criteria for the study were as follows:
(1) palpable masses verified by breast imaging; or
(2) nonpalpable masses found by breast imaging, with or without other symptoms.
The exclusion criteria were as follows:
- biopsy of the breast lesions performed before US examinations;
- pregnancy or lactation;
- neoadjuvant treatment;
- only simple cysts visible on US images;
- no evident focal breast lesions suitable for CAD evaluation.
The patients underwent US examinations before they received further treatment. All lesions were biopsied and had a final pathological diagnosis. The pathological results were deemed the gold standard for the study.
The patients received standard bilateral breast US scans performed by an experienced radiologist. A commercial US unit (RS85, Samsung Medison Co., Ltd., Korea) equipped with an L3-12A high-frequency linear probe (3-12 MHz) and the CAD software S-Detect™ for Breast was used.
2.2 Study protocol
2.2.1 Image assessment of S-Detect™ for Breast and the five in-training residents
A single grayscale US image showing the lesion at its maximum size was manually selected for S-Detect™ for Breast analysis. First, the radiologist clicked the center of the target mass, and S-Detect™ segmented the contour of the lesion automatically. The radiologist adjusted the outline of the lesion manually when necessary. S-Detect™ then classified each lesion in a dichotomous form (possibly benign or possibly malignant). The US descriptors extracted by S-Detect™ were also displayed, including shape, orientation, margins, pattern and posterior acoustic features.
Five in-training residents with 1-3 years of working experience were invited to assess the lesions independently. All images of each lesion (including grayscale, color Doppler flow and elastography images) were retrospectively reviewed by the residents, who were asked to classify the lesions according to the BI-RADS lexicon. The residents were blinded to the S-Detect™ and pathology results. The five residents are denoted R1 to R5. R1, R2 and R3 were third-year residents, each with one year of experience with breast US. R4 and R5 were second-year residents, each with six months of experience with breast US. All five residents had received a standard training program for breast US and had passed the basic US examinations organized by our medical center.
A cutoff value was set at category 4 to transform the residents’ results into a dichotomous form. Category 2 and 3 lesions were deemed possibly benign, and category 4 and 5 lesions were deemed possibly malignant. The diagnostic performances of S-Detect™ and the five residents were evaluated, and comparisons were made between S-Detect™ and the residents.
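The cutoff rule described above can be sketched in a few lines of Python. This is an illustration only, not the software used in the study; the function name `dichotomize_birads` and the string encoding of the categories (for example "3", "4a", "5") are assumptions made for the example.

```python
# Hypothetical encoding of the BI-RADS categories assessed in the study.
VALID_CATEGORIES = {"2", "3", "4", "4a", "4b", "4c", "5"}


def dichotomize_birads(category: str) -> str:
    """Map a BI-RADS category to a binary label with the cutoff at category 4."""
    label = category.strip().lower()
    if label not in VALID_CATEGORIES:
        raise ValueError(f"unsupported BI-RADS category: {category!r}")
    # Categories 4 (including 4a, 4b, 4c) and 5 are at or above the cutoff.
    return "possibly malignant" if label[0] in "45" else "possibly benign"
```

For instance, `dichotomize_birads("4a")` yields the possibly malignant label, while `dichotomize_birads("3")` yields the possibly benign label.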
2.2.2 Integration of the results of the five residents and S-Detect™ for Breast
To evaluate the potential of S-Detect™ to help improve the diagnostic accuracy of residents, the results of the five in-training residents were integrated with those of S-Detect™ in category 4a lesions. We compared the results of S-Detect™ and those of the residents for each lesion. If the lesion was diagnosed as category 4a by the residents but possibly benign by S-Detect™, the decision of S-Detect™ was adopted, thus downgrading category 4a lesions to the possibly benign group. Due to the high sensitivity of the residents presented in the preliminary experiments, we did not change the category 3 lesions when they were classified as possibly malignant by S-Detect™. In addition, the rest of the classifications made by the residents remained unchanged.
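The integration rule described in this paragraph can be expressed as a small decision function. The sketch below is illustrative and assumes a hypothetical encoding: resident categories as strings such as "3" or "4a", and S-Detect™ output as one of two label strings. The function and constant names are not from the study.

```python
BENIGN = "possibly benign"
MALIGNANT = "possibly malignant"
# Hypothetical encoding of the BI-RADS categories assigned by the residents.
VALID_CATEGORIES = {"2", "3", "4", "4a", "4b", "4c", "5"}


def integrate_with_cad(resident_category: str, cad_label: str) -> str:
    """Combine a resident's BI-RADS category with the CAD label for one lesion."""
    category = resident_category.strip().lower()
    if category not in VALID_CATEGORIES:
        raise ValueError(f"unsupported BI-RADS category: {resident_category!r}")
    if cad_label not in (BENIGN, MALIGNANT):
        raise ValueError(f"unsupported CAD label: {cad_label!r}")
    # Only category 4a lesions that the CAD calls benign are downgraded.
    if category == "4a" and cad_label == BENIGN:
        return BENIGN
    # Every other lesion keeps the resident's own dichotomized result,
    # so a category 3 lesion stays benign even if the CAD calls it malignant.
    return MALIGNANT if category[0] in "45" else BENIGN
```

Under this rule the CAD output can only move a lesion from the possibly malignant group to the possibly benign group, never the reverse, which matches the stated intent of preserving the residents' sensitivity.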
The diagnostic performances of the integrated results were calculated and compared with the original results of the residents without S-Detect™. Interrater variability before and after integration with S-Detect™ was assessed using intraclass correlation coefficients (ICCs).
2.3 Statistical analysis
The diagnostic performances of the residents, S-Detect™ and the integrated results of the residents and S-Detect™ for category 4a lesions were evaluated using sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), positive predictive value (PPV), negative predictive value (NPV), the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC). In addition, 2 × 2 contingency tables were constructed to calculate these indicators. Sensitivity and specificity were compared between residents using the chi-square test. The AUC values were compared using the Z test.
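For reference, the indicators derived from a 2 × 2 contingency table follow the standard definitions, which can be sketched as below. The function name and argument names are assumptions for illustration, with pathology as the reference standard; the study itself used SPSS and MedCalc for these calculations.

```python
import math


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard diagnostic indicators from 2 x 2 table counts.

    tp/fn: malignant lesions called malignant/benign by the reader.
    tn/fp: benign lesions called benign/malignant by the reader.
    Assumes at least one malignant lesion, one benign lesion, one positive
    call and one negative call, so that no proportion has a zero denominator.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    false_negative_rate = fn / (fn + tp)  # equals 1 - sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Likelihood ratios are unbounded when their denominator is zero.
        "PLR": sensitivity / false_positive_rate if fp > 0 else math.inf,
        "NLR": false_negative_rate / specificity if tn > 0 else math.inf,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }
```

The function would be called once per reader (each resident, S-Detect™, and each integrated result) with that reader's own table counts.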
ICCs with 95% confidence intervals were calculated to evaluate the interrater variability among multiple raters. In this study, each subject was rated by the same raters, and the absolute-agreement form of the ICC was used because systematic differences among the raters were relevant. ICC values were interpreted as follows:
Poor agreement: ICC < 0
Slight agreement: 0 < ICC < 0.20
Fair agreement: 0.20 < ICC < 0.40
Moderate agreement: 0.40 < ICC < 0.60
Substantial agreement: 0.60 < ICC < 0.80
Perfect agreement: 0.80 < ICC < 1
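As a rough illustration of how an absolute-agreement ICC can be computed and mapped onto the scale above, the following sketch implements the two-way random-effects, single-rater form, often written ICC(2,1). Two points are assumptions rather than details stated in the text: whether the single-rater or average-rater form was reported, and how boundary values (0, 0.20, 0.40, and so on) are assigned, since the listed intervals are open. Here each boundary value is placed in the lower band. Function names are hypothetical, and the study itself used SPSS.

```python
def icc_absolute_agreement(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per lesion, each holding one score per
    rater. Requires at least two lesions, two raters and some variation.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 rows and >= 2 raters")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares from the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc: float) -> str:
    """Map an ICC value to the agreement bands listed above.

    Assumption: boundary values fall in the lower band (0.20 -> "slight").
    """
    if icc > 1:
        raise ValueError("ICC cannot exceed 1")
    if icc < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if icc <= upper:
            return label
    return "perfect"
```

Because the column (rater) mean square enters the denominator, a rater who scores systematically higher than the others lowers this ICC, which is the behavior wanted when systematic differences among raters matter.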
A p-value of less than 0.05 was considered statistically significant. SPSS software (IBM, SPSS 21.0) and MedCalc (MedCalc software, version 15, Ghent, Belgium) were used for the analyses.