In this study, we developed and evaluated six machine learning models (RF, LR, XGB, KNN, SVM, and CNB) for predicting delays in seeking medical care among breast cancer patients. Among these models, the RF model demonstrated the best predictive performance (Table 2), with AUC values of 1.00, 0.87, and 0.76 on the training, validation, and external validation sets, respectively. To date, only the study by Samira et al. has developed a machine learning prediction model for delays in seeking medical care for breast cancer, and it did so in the Iranian population [19]. That study also found that the RF model had the best predictive performance. However, Samira et al. performed only internal validation, and most of the evaluation indicators of their RF model were weaker than those of the model developed in this study (Accuracy, 0.70 vs. 0.79; Specificity, 0.361 vs. 0.86; AUC, 0.788 vs. 0.87), although its sensitivity was higher (0.837 vs. 0.77). In addition, this study conducted external validation using a separate sample, which yielded an AUC of 0.76. Furthermore, this study used the SHAP approach to visualize the model, addressing the "black box" issue commonly associated with machine learning predictive models. These results suggest that the predictive model for delays in seeking medical care for breast cancer developed for the Chinese population has acceptable external generalizability.
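For readers who wish to reproduce the kind of indicators compared above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC can be computed from labels and predicted scores. The labels, scores, 0.5 threshold, and function names are illustrative assumptions and are not the study's data or code.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels (1 = delay)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc_from_scores(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = auc_from_scores(y_true, scores)
```

Because sensitivity and specificity depend on the chosen threshold whereas AUC does not, two models can trade one against the other, as in the comparison with Samira et al.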
This study found that the prevalence of patient delay was 39.3%, which was similar to that reported in previous studies [8]. Breast cancer patients may be influenced by various factors that lead them to disregard their symptoms and postpone seeking medical care. SHAP analysis identified eight significant predictors of delays in seeking medical care among breast cancer patients: distance from the hospital, physical examination status, hospital choice, health value, education level, medical payment method, preferred solution for breast discomfort, and religion. These eight variables are readily available in both hospital and community settings and involve almost no cost to collect, which should facilitate the subsequent promotion and application of the model.
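As an illustration of how a SHAP-based ranking of predictors is commonly obtained, the sketch below ranks features by their mean absolute SHAP value across patients. The per-patient SHAP values are hypothetical placeholders for three of the eight predictors, and the helper name is an assumption; the study's own values come from its fitted RF model.

```python
from typing import Dict, List


def rank_by_mean_abs_shap(shap_rows: List[Dict[str, float]]) -> List[str]:
    """Rank features by mean |SHAP value| across samples, largest first."""
    features = list(shap_rows[0].keys())
    mean_abs = {
        name: sum(abs(row[name]) for row in shap_rows) / len(shap_rows)
        for name in features
    }
    return sorted(features, key=lambda name: mean_abs[name], reverse=True)


# Hypothetical per-patient SHAP values, NOT taken from the study.
shap_rows = [
    {"distance_from_hospital": 0.30, "physical_examination_status": -0.10, "religion": 0.02},
    {"distance_from_hospital": -0.20, "physical_examination_status": 0.25, "religion": -0.01},
    {"distance_from_hospital": 0.40, "physical_examination_status": 0.05, "religion": 0.03},
]

ranking = rank_by_mean_abs_shap(shap_rows)
```

Taking the absolute value before averaging matters: a feature that pushes some patients toward delay and others away from it would otherwise cancel out and appear unimportant.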
Fig. 7 showed that distance from the hospital was the most important predictor of delays in seeking medical care among patients with breast cancer. Studies have confirmed that access to health care is an important factor in delayed medical treatment for breast cancer patients [20, 21]. Studies have also confirmed that living in rural areas is an important risk factor for delays in seeking medical care for breast cancer [22, 23]. Hence, improving the accessibility of healthcare services is crucial for reducing delays in seeking medical care among patients with breast cancer.
Our study found that physical examination status was the second most significant predictor of delays in seeking medical care among patients with breast cancer. In this study, 34.81% (188/540) of the patients had never undergone a physical examination, and 55.85% (105/188) of these patients experienced delays in seeking medical care. A study of physical examinations for malignant tumors in healthy people [23] noted that the more the subjects knew about malignant tumors, the more likely they were to seek medical consultation early and the less likely they were to delay medical treatment, which is consistent with the results obtained in this study.
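The two proportions reported above can be checked directly from the counts given in the text. The following short sketch does so; the helper name `percent` is an assumption introduced for illustration.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


total_patients = 540            # patients in the study sample
never_examined = 188            # patients who never had a physical examination
never_examined_delayed = 105    # of those, patients who delayed seeking care

share_never_examined = percent(never_examined, total_patients)
share_delayed_among_never_examined = percent(never_examined_delayed, never_examined)
```

Note that the second percentage uses the 188 never-examined patients as its denominator, not the full sample of 540.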
Education level and health value were also important predictors of delays in seeking medical care. Other studies have likewise reported that education level is associated with delayed presentation [2, 24, 25]. Individuals with lower levels of education tend to exhibit prolonged delays in seeking medical care. Moreover, in the collected questionnaires, most of the respondents with lower education levels also had lower health values. Therefore, improving education levels and enhancing the dissemination of knowledge related to breast cancer are crucial for reducing delays in seeking medical care for breast cancer.
The preferred solution for breast discomfort was also correlated with patient delay in this study, which is consistent with the findings of Ren et al. [26]. As noted above, distance from the hospital was the most important predictor of delays in seeking medical care among breast cancer patients. When patients felt breast discomfort, they preferred to visit a small clinic. The reason may be that women in China play an important role in caring for children and families, so they may prefer to buy medicine at small clinics close to home and visit hospitals only when serious symptoms appear. However, because of the limited diagnostic and treatment capacity of such clinics, some patients received an incorrect explanation of their symptoms, resulting in a delay.
The medical payment method was an important predictor of delays in seeking medical care in our study. The lack of health insurance and limited access to medical services contribute to patients' delay in seeking necessary medical attention. Rauscher et al. [20] also found that the availability and utilization of health care services strongly predicted patient delays in seeking medical care. Patients with medical insurance and regular visits to doctors were less likely to experience delays. Nelissen et al. [21] observed that individuals tend to assess their available health care resources before seeking medical help. Factors such as the absence of medical insurance, difficulties in accessing medical services, and complex referral processes all act as barriers for patients seeking timely medical assistance.
Our study identified religion as an important factor related to delays in seeking medical care. In this study, the proportion of breast cancer patients with religious beliefs who experienced delays in seeking medical care was as high as 81.25%, which is likely related to the research setting. This study was conducted in a tertiary first-class specialized oncology hospital in Sichuan, which treats mostly cancer patients from the Sichuan region. Sichuan has a unique geographical location, and the majority of its population with religious beliefs is Tibetan. Tibetans mostly reside in remote plateau areas where medical resources are limited, and the overall educational level among the Tibetan people tends to be relatively low. Ma et al. [27] also identified religious belief as a risk factor for delays in seeking medical care among women with cervical cancer, which is similar to the findings of our study.
Limitations
Although the model developed in this study demonstrated excellent discrimination, calibration, and clinical effectiveness, there are several limitations. First, the data were derived from self-reported questionnaires, which may have introduced bias. Second, this study included data from only a single center, and the sample size was small, which may have led to distribution bias and limits how well the datasets support model validation. Therefore, it is essential to obtain multicenter clinical data to provide more reliable guidance for clinical practice. Additionally, the prediction model will need to be updated to further enhance its performance.