To our knowledge, this is the first study to explore methods for predicting HER2 status in surgical specimens from GAC patients and for evaluating trastuzumab efficacy in biopsy specimens from HER2-positive GAC patients who had undergone initial trastuzumab treatment combined with chemotherapy. Our DL algorithm exhibited high sensitivity and specificity despite a small sample size, supporting the feasibility and effectiveness of our approach. In this study, we introduced a CNN trained on histological features of primary GAC tissue that can accurately predict HER2 status and assess the response to trastuzumab treatment. Additionally, we conducted ablation experiments to compare the performance of our algorithm with that of other state-of-the-art methods.
DL, a subset of artificial intelligence (AI), can handle large amounts of patient data and recognize patterns across a diverse range of applications and industries. It has been applied to image classification in medical imaging and computer vision, with particular emphasis on computed tomography (CT) scans, magnetic resonance imaging (MRI), and pathological images [17, 18].
Recently, there has been a notable increase in the application of DL algorithms in the field of pathology, especially for cancer classification and estimating molecular features and cancer subtypes such as EBV and MSI status in GAC. Patient-level AUC results for EBV range from 0.8723 to 0.969 [9, 19], and for MSI, they range from 0.770 to 0.8848 [10, 20]. DL models have thus shown significant efficacy in directly identifying features, overcoming the subjective bias inherent in extracting features from H&E-stained slides.
Assessing HER2 status is of vital importance in guiding clinical decisions across various cancer types. However, the widespread use of HER2 testing is limited by the cost of FISH and IHC. In HER2-positive GAC, the prevalence of intratumoral HER2 heterogeneity ranges from 45% to 79% by IHC and from 23% to 54% by ISH, exceeding the rates observed in breast cancer. This variability may be associated with the diversity of GAC tissues and the biological characteristics of the tumors, and it may have implications for treatment efficacy and patient prognosis [21, 22]. Hence, there is a pressing need for a model capable of accurately predicting HER2 status in GAC.
Several studies have used DL methods to investigate the correlation between HER2 status and therapeutic efficacy in breast cancer [23, 24], but not in GAC. This paper presents a model that explores the potential of DL for predicting HER2 status in GAC, introducing an algorithm that uses morphological features for HER2 status prediction. During model development, we compared the classification performance of various DL networks and found that CLAM performed best. Further training with a contrastive learning model yielded a 1.73% increase in average AUC and a 1.60% increase in accuracy compared with a single CLAM classifier.
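The attention-based multiple-instance pooling that underlies CLAM-style classifiers can be sketched as follows. This is a minimal illustration, not the model used in this study: it assumes each slide is a bag of patch embeddings, uses a simplified ungated tanh attention in place of CLAM's gated attention, and the names `attention_pool`, `w_attn`, and `w_cls` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    patches: Sequence[Sequence[float]],
    w_attn: Sequence[float],
    w_cls: Sequence[float],
    bias: float = 0.0,
) -> Tuple[float, List[float]]:
    """Aggregate patch embeddings into one slide-level probability.

    Assumption: a single linear attention head with tanh activation,
    which is a simplification of gated attention.
    """
    # One attention score per patch.
    scores = [math.tanh(sum(w * x for w, x in zip(w_attn, p))) for p in patches]
    # Normalize scores so the patch weights sum to 1.
    weights = softmax(scores)
    # Slide embedding = attention-weighted average of patch embeddings.
    dim = len(patches[0])
    slide = [sum(a * p[d] for a, p in zip(weights, patches)) for d in range(dim)]
    # Linear classifier with a sigmoid on the slide embedding.
    logit = sum(w * x for w, x in zip(w_cls, slide)) + bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, weights
```

In a trained model, `w_attn` and `w_cls` would be learned jointly from slide-level labels; the returned weights indicate which patches contributed most to the slide-level prediction.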
Our findings indicate that the transformer and DL models achieved a higher AUC than the models developed by Bychkov (0.847 vs. 0.70) [23]. Furthermore, Bychkov's research employed 712 WSI cases for training, whereas we used a smaller sample of 300 WSI cases. Despite this disparity in sample size, our algorithm demonstrated superior accuracy in predicting HER2 status. Currently, cases scored as HER2 2+ by IHC require additional FISH testing to confirm HER2 status. We therefore also evaluated the proposed algorithm specifically on cases with an IHC score of 2+, where it achieved a high AUC of 0.903 and an accuracy of 0.869. Nevertheless, the sample size in the current study was limited, and future research with larger cohorts is needed to improve the reliability of the algorithm.
The efficacy of trastuzumab combined with chemotherapy in improving the survival rate of HER2-positive metastatic GAC patients has been demonstrated in previous studies [25]. However, the significant heterogeneity of GAC makes it difficult to achieve optimal outcomes with this treatment approach. We trained a classifier on pretreatment samples obtained from patients with documented trastuzumab responses, achieving an AUC of 0.833 by 10-fold cross-validation. To our knowledge, this study represents the first reported application of CNN-based DL algorithms to predict trastuzumab efficacy in GAC patients, and it shows the value of predicting the response to anti-HER2 therapy.
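The way an AUC is estimated by 10-fold cross-validation can be sketched as below. This is a hedged illustration under stated assumptions, not the pipeline used in this study: the nearest-centroid scorer is a deliberately simple stand-in for the DL classifier, the folds are stratified by label (the text does not state whether stratification was used), and all function names are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels: Sequence[int], k: int, seed: int = 0) -> List[List[int]]:
    """Split indices into k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fit_centroid_scorer(
    X: Sequence[Sequence[float]], y: Sequence[int]
) -> Callable[[Sequence[float]], float]:
    """Stand-in model: project onto the direction from the negative to the positive centroid."""
    dim = len(X[0])
    mu_pos = [mean(x[d] for x, t in zip(X, y) if t == 1) for d in range(dim)]
    mu_neg = [mean(x[d] for x, t in zip(X, y) if t == 0) for d in range(dim)]
    direction = [p - n for p, n in zip(mu_pos, mu_neg)]
    return lambda x: sum(w * v for w, v in zip(direction, x))


def cross_validated_auc(
    X: Sequence[Sequence[float]], y: Sequence[int], k: int = 10, seed: int = 0
) -> Tuple[float, List[float]]:
    """Train on k-1 folds, score the held-out fold, and average the per-fold AUCs."""
    fold_aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = fit_centroid_scorer([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_aucs.append(auc([y[i] for i in test_idx], [score(X[i]) for i in test_idx]))
    return mean(fold_aucs), fold_aucs
```

Each class must contain at least `k` cases so that every held-out fold includes both responders and non-responders; with small cohorts, the spread of the per-fold AUCs is worth reporting alongside the mean.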
However, this study has several limitations that we plan to address in future work. First, our experiments did not cover the prediction of HER2 status in GAC biopsy specimens and lacked prospective samples for validation. Future efforts will involve collecting biopsy specimens and prospective samples to further optimize the model for clinical applicability. Second, incorporating multicenter data would enable the model to learn more diverse features; we will therefore collect additional cases from other centers to improve its generalizability. Third, future work will combine the current imaging data with clinical data (such as tumor manifestations and circulating tumor markers) or multimodal features (such as radiomic features) to construct a multimodal DL model. Finally, we should explore more intuitive visualization methods to make the model's black-box predictions of HER2 positivity and trastuzumab efficacy in GAC more interpretable.