To support targeted adjuvant therapy for HER2-positive and HER2-low BC, this study developed a multimodal method that combines breast US images with clinical parameters to predict HER2 status in BC patients. The HPN showed better diagnostic performance in distinguishing HER2 status than any unimodal method. Encouragingly, the model discriminated well between HER2-low and HER2-positive patients (HER2-positive AUC = 0.87, HER2-low AUC = 0.81). Previous studies on predicting HER2 status have reported AUC values ranging from 0.81 to 0.87 (23), (24), (14). Compared with these studies, the Transformer-ResNet18 fusion model in HPN achieved favorable diagnostic performance (overall AUC = 0.86) for HER2 status prediction. HPN has the potential to serve as a non-invasive imaging biomarker that complements postoperative pathology and ISH, improving the accuracy of non-invasive HER2 status assessment.
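As an illustration of how per-class AUC values such as those quoted above can be obtained, the following minimal sketch computes a one-vs-rest AUC for each class from predicted probabilities. It assumes a one-vs-rest scheme, which the text does not state, and the function names, class labels (including `HER2-zero`), and toy numbers are hypothetical rather than taken from the study.

```python
def binary_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(prob_rows, y_true, classes):
    """Per-class AUC: each class in turn is treated as positive, the rest as negative."""
    result = {}
    for k, name in enumerate(classes):
        scores = [row[k] for row in prob_rows]
        labels = [1 if y == name else 0 for y in y_true]
        result[name] = binary_auc(scores, labels)
    return result


# Hypothetical class names and toy probabilities, for illustration only.
classes = ["HER2-zero", "HER2-low", "HER2-positive"]
probs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.3, 0.5, 0.2],
]
truth = ["HER2-zero", "HER2-low", "HER2-positive", "HER2-low"]
per_class_auc = one_vs_rest_auc(probs, truth, classes)
```

The pairwise comparison is quadratic in the number of cases, which is acceptable for a sketch; a rank-based implementation would be preferable for large cohorts.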
Most current studies on HER2 expression prediction are based on intratumoral imaging features of the primary BC lesion and neglect peritumoral imaging features, which may contain information about the tumor microenvironment (25). Evidence shows that peritumoral radiomic features can characterize the BC tumor microenvironment and that biological changes in the tissue surrounding a breast mass may be potential predictive or prognostic markers (26). This study is the first to develop a multimodal model for predicting HER2 status based on the peritumoral region of breast US. Intratumoral, 1.2×, 1.4×, and 1.6× peritumoral ranges were compared, and the results confirmed that expanding the ROI to a 1.6× peritumoral range improved prediction of HER2-low BC (HER2-low AUC from 0.77 to 0.81), whereas overall predictive efficacy for HER2 expression did not increase significantly. This may be because US contains less sequence and image information than MRI (27). However, the results also suggest that HPN can accommodate a range of ROIs while maintaining stable HER2 expression prediction. By comparison, other studies found that peritumoral texture features on DCE-MRI, taken 9 to 12 mm from the outermost edge of the primary lesion, identified patients with HER2-positive BC better than intratumoral texture features, with an AUC of 0.85 (26). Further studies found that peritumoral imaging based on DCE-MRI (4 mm outward from the tumor margin) was valuable in predicting HER2 expression, and that models combining intratumoral and peritumoral imaging had higher diagnostic efficacy than intratumoral or peritumoral imaging alone (27).
Some problems remain in combining intratumoral and peritumoral BC radiomics, such as unstable features caused by differing ROI delineation methods and large differences in predictive performance caused by differing modeling methods. Regarding the criteria and extent of peritumoral expansion, many studies have used the simple approach of centering on the breast mass and expanding its diameter to define the peritumoral area (28). However, too small a peritumoral extent may miss tissue that carries information about the BC tumor microenvironment, whereas too large an extent may include too much normal tissue and thus distort quantification of the tumor microenvironment. In future studies, it will be more important to determine the cut-off value of the expansion range based on the surgical margins of BC, thereby exploring the optimal peritumoral range and guiding clinical practice. Quantitative analysis of combined intratumoral and peritumoral radiomic features has potential application in predicting HER2 expression in BC and provides a useful aid for therapeutic decision-making.
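The center-and-expand approach described above can be sketched as follows. This is a minimal illustration, assuming the tumor ROI is an axis-aligned bounding box in pixel coordinates and that the expanded box is clipped to the image; the function name `expand_roi` and the example coordinates are hypothetical and do not reproduce the study's actual preprocessing.

```python
def expand_roi(box, factor, image_size):
    """Scale a bounding box about its center and clip it to the image.

    box        -- (x_min, y_min, x_max, y_max) of the tumor ROI, in pixels
    factor     -- expansion factor, e.g. 1.0 (intratumoral), 1.2, 1.4, 1.6
    image_size -- (width, height) of the image, in pixels
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size

    # Center of the original ROI.
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0

    # Half-extent of the expanded ROI along each axis.
    half_w = (x_max - x_min) * factor / 2.0
    half_h = (y_max - y_min) * factor / 2.0

    # Clip so the expanded ROI never leaves the image.
    return (
        max(0, round(cx - half_w)),
        max(0, round(cy - half_h)),
        min(width, round(cx + half_w)),
        min(height, round(cy + half_h)),
    )


# Hypothetical tumor box on a 640 x 480 image.
tumor_box = (100, 100, 200, 200)
rois = {f: expand_roi(tumor_box, f, (640, 480)) for f in (1.0, 1.2, 1.4, 1.6)}
```

Clipping matters for lesions near the image border, where the expanded ROI would otherwise extend beyond the scanned field; in that case the effective peritumoral margin is smaller than the nominal factor implies.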
Some studies have reported that histopathological data such as tumor grade, lymphovascular invasion (LVI), Ki-67, and hormone receptor status may be correlated with HER2 status (23), (24), (14). However, because isolated pathological or imaging features are limited in their ability to characterize breast tumors comprehensively, this study further investigated a multimodal model combining radiomics and clinicopathological data. This study is the first to propose a multimodal neural network model (HPN) for HER2 status prediction from US images and clinical features. HPN extracted image features using ResNet as the image encoder and applied structured encoding to the clinical data. The image features were reduced in dimensionality by PCA, after which the features from the two modalities were fused. Finally, a Transformer was used to construct a classifier to predict HER2 status. The results showed that the HPN model had significant advantages over algorithms such as Mask R-CNN, Cascade R-CNN, MS-RCNN, and ConvNeXt (29). Firstly, the Transformer's self-attention mechanism effectively captures long-range dependencies across the entire image, which is crucial for identifying subtle and dispersed features in BC US images (30). Moreover, unlike traditional convolutional neural networks (CNNs), which rely on convolution operations, the highly modular structure of Transformers offers exceptional flexibility, making them more effective in handling deformed, irregular, and complex structures (29). This makes them particularly suitable for analyzing the irregular and indistinct boundaries in BC US images. Although Transformers typically require large datasets for training, their strengths in pre-training and transfer learning enable them to perform well even with limited data. Similar results were observed when Transformers were used to predict complete response in BC patients who received NAC (AUC = 0.910) (31).
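The PCA and fusion steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the HPN implementation: the image features are stand-in random vectors (512 values per image, the size of ResNet18's pooled output), fusion is assumed to be simple concatenation, and the number of principal components and of clinical variables are hypothetical. The Transformer classifier is omitted.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project features onto their top principal components (PCA via SVD)."""
    # Center each feature column so the SVD yields principal directions.
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def fuse(image_features, clinical_features, n_components):
    """Reduce image features with PCA, then concatenate clinical features.

    Concatenation is an assumed fusion strategy for this sketch.
    """
    reduced = pca_reduce(image_features, n_components)
    return np.concatenate([reduced, clinical_features], axis=1)


# Stand-in data: 20 patients, 512 image features, 6 encoded clinical variables.
rng = np.random.default_rng(0)
image_features = rng.normal(size=(20, 512))
clinical_features = rng.normal(size=(20, 6))

fused = fuse(image_features, clinical_features, n_components=8)
```

In practice the PCA projection would be fitted on the training set only and then applied unchanged to validation and test data, so that no information leaks across the split.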
ResNet18 is a deep learning model known for its residual structure, which mitigates the vanishing gradient problem in deep networks. Compared with other ResNet variants, ResNet18 has a simpler structure with lower computational requirements, making it efficient and suitable for resource-limited applications (32). In BC radiomics, ResNet18 has been effectively applied to classify breast images, demonstrating its ability to accurately identify features and improve diagnostic accuracy. For example, in MG radiomics, the ResNet18 model has shown excellent performance in predicting malignant microcalcifications in BC patients (AUC = 0.88) (33). Previous studies have documented the diagnostic performance of various machine learning algorithms in the context of BC (23), (24), (14). Random Forest (RF) integrates multiple decision trees and automatically assesses feature importance, aiding feature selection and model interpretation. In BC radiomics diagnosis, RF has shown unique advantages, with studies indicating an AUC of up to 0.929 in the classification of benign and malignant breast lesions. By aggregating many trees, RF mitigates the risk of overfitting and enhances prediction accuracy and reliability. Additionally, because the decision trees grow independently, they can be trained in parallel, which accelerates training. These attributes make RF highly effective for tasks such as classification, regression, and feature selection (34).
Some limitations of this study should be acknowledged. The imbalanced dataset and single-center data restricted the performance of HPN. GANs might be a potential method for image generation to address the imbalance, and multi-center studies should be conducted to improve generalization and diagnostic efficacy (35)(36). The US images in this study were sourced from one manufacturer (Philips). Images from multiple manufacturers should be included in future models. Advanced multimodal models that also incorporate mammography, MRI, and pathological data could enhance performance. Combining deep learning-based tumor detection with radiomics could enable the development of comprehensive clinical diagnostic software.
In conclusion, by combining clinicopathological parameters with conventional US, HPN offers a non-invasive and practical method for preoperative prediction of HER2 expression. It also has the potential to identify BC patients with HER2-low expression, thereby helping to determine appropriate ADC treatment regimens. Future modeling with larger samples and multimodal data is needed to improve prediction accuracy. Prospective multi-center validation is expected to provide high-level evidence for subsequent clinical application.