This study emphasized the importance of developing a highly accurate ML-based ND prediction model that can be universally applied to adults with T2DM in South Korea, and it provided a simple and precise assessment of the future annual risk of ND in the national diabetic population. The AdaBoost, LightGBM, Random Forest, and XGBoost ensemble models showed excellent performance, with AUROC values ranging from 0.79 to 0.82 on the discovery dataset and from 0.79 to 0.83 on the external validation dataset. Age and cardiovascular disease were among the top 15 features ranked by importance. The results of this study can potentially improve patient outcomes by enabling timely intervention, improving the understanding of contributing variables, and reducing the burden of neurodegenerative complications in patients with T2DM.
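To make the AUROC values reported above concrete, the following is a minimal sketch of how an AUROC can be computed from predicted risk scores. It is an illustration only and not the study's evaluation code: the function name `auroc` and the toy labels and scores are hypothetical. It uses the rank-based definition, in which the AUROC is the probability that a randomly chosen patient who develops ND receives a higher predicted risk than a randomly chosen patient who does not.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = developed ND, 0 = did not.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
# Hypothetical predicted risks from some model.
scores = [0.10, 0.35, 0.40, 0.45, 0.60, 0.30, 0.20, 0.90]

print(auroc(labels, scores))  # 13 of 16 pairs ranked correctly -> 0.8125
```

In this toy example, 13 of the 16 possible case/non-case pairs are ranked correctly, so the AUROC is 0.8125. An AUROC of 0.82 can be read the same way: in roughly 82% of such pairs, the patient who develops ND is assigned the higher predicted risk.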
This study was based on a large cohort of the Korean population and used data from three university hospitals. Multiple variables, such as anthropometric variables, medical history, medication use, and laboratory tests, were used for model development. An advantage of this study is that long-term follow-up data of approximately three years were available for outcome evaluation. This ND prediction model is meaningful because it demonstrates sufficiently good performance, with a mean AUROC of 0.82, using only questionnaires, body measurements, and blood tests commonly conducted in clinical practice for patients with diabetes.
Our findings provide insights into the metrics that can be used in primary care for ND prediction. Existing biomarkers for ND include neuroimaging, such as brain MRI and SPECT/PET, or cerebrospinal fluid testing, and biomarkers using blood samples, such as high-sensitivity C-reactive protein (hsCRP), GGT, homocysteine, apolipoprotein E, and uric acid 23,24. Although they can be used as adjuncts to increase diagnostic confidence, most are expensive or invasive and are not recommended as routine diagnostic tests in clinical practice.
According to the feature importance analysis, age, cardiovascular disease, cancer, neuropathy, and ALP levels were the top five predictors of ND. The associations of age, cardiovascular disease, and neuropathy with ND were consistent with the results of previous studies. Age is a conventional risk factor for ND 25. Cardiovascular disease is a known risk factor for ND 26. Cardiometabolic risk factors such as diabetes, hypertension, and hyperlipidemia have also been consistently associated with the risk of developing ND 27. The association between peripheral neuropathy and ND in this study is consistent with its reported association with the development of MCI and dementia in the general and diabetic populations 28. In contrast, previous studies suggest that the relationship between cancer and ND is likely to be inverse: the incidence of cancer is reportedly lower in patients with ND 29. It is important to note that aging also affects the occurrence of cancer 30, and this study did not adjust for the effect of aging on cancer; therefore, further research is needed to determine whether there is a causal relationship between cancer itself and ND. The results related to ALP levels were consistent with previous reports showing that ALP levels were increased in patients with AD 31. In contrast, some studies have found no significant association between ALP and PD 32. This may be related to increased bone ALP, as PD is associated with an increased incidence of osteoporosis, falls, and fractures 33. Moreover, the association between ALP variability and ND development has not been previously studied and warrants further investigation.
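As an illustration of how a feature importance ranking of this kind can be obtained, the sketch below uses permutation importance: each feature is shuffled across patients, and the resulting drop in AUROC is taken as its importance. This is an assumption made for illustration, since this section does not state which importance method the study used. The scoring function `risk_score`, the feature names, and the toy data are all hypothetical.

```python
import random


def auroc(labels, scores):
    """Rank-based AUROC; ties between a positive and a negative count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def risk_score(row):
    # Hypothetical stand-in for a fitted model: it uses age and
    # cardiovascular disease history and ignores the "noise" feature.
    return 0.01 * row["age"] + 0.3 * row["cvd"]


def permutation_importance(rows, labels, features, n_repeats=20, seed=0):
    """Mean drop in AUROC after shuffling one feature at a time."""
    rng = random.Random(seed)
    baseline = auroc(labels, [risk_score(r) for r in rows])
    importance = {}
    for f in features:
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[f] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row with only feature f replaced by a shuffled value.
            permuted = [dict(r, **{f: v}) for r, v in zip(rows, shuffled)]
            permuted_auc = auroc(labels, [risk_score(r) for r in permuted])
            drops.append(baseline - permuted_auc)
        importance[f] = sum(drops) / n_repeats
    return importance


# Hypothetical toy cohort: 1 = developed ND, 0 = did not.
rows = [
    {"age": 48, "cvd": 0, "noise": 3},
    {"age": 52, "cvd": 0, "noise": 7},
    {"age": 55, "cvd": 0, "noise": 1},
    {"age": 59, "cvd": 0, "noise": 9},
    {"age": 66, "cvd": 1, "noise": 4},
    {"age": 70, "cvd": 0, "noise": 8},
    {"age": 73, "cvd": 1, "noise": 2},
    {"age": 78, "cvd": 1, "noise": 6},
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

importance = permutation_importance(rows, labels, ["age", "cvd", "noise"])
ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

Because the stand-in model ignores `noise`, shuffling that feature leaves the AUROC unchanged and its importance is exactly zero, whereas shuffling a feature the model relies on degrades discrimination. Note that a ranking like this describes what the model uses for prediction; it does not by itself establish that a feature causes ND.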
While some studies have shown an increased incidence of ND with long-term exposure to metformin 34, conflicting studies have suggested that metformin has therapeutic potential for ND 35. Because metformin users may have more severe hyperglycemia than non-users, it is difficult to conclude that metformin use itself increases the risk of developing ND. Few studies have investigated the association between meglitinide use and ND. In one study, meglitinide showed a significant protective effect against dementia in combination therapy rather than in monotherapy 36. Because meglitinide is often used in combination with agents such as metformin rather than as a monotherapy, and in patients whose diabetes is not glycemically controlled despite multidrug therapy, meglitinide users may be overrepresented among those who develop ND because of hyperglycemia. However, the number of meglitinide users in this study was too small to confirm this association.
ARBs and CCBs are the first and second most prescribed antihypertensive monotherapies in Korea, respectively, and the combination of ACEi/ARBs and CCBs is the most prescribed two-drug therapy 37. Nevertheless, CCBs ranked higher than ARBs in feature importance for the development of ND. The preventive effect of CCBs on ND has been recognized in epidemiologic studies 38, and it is known that specific calcium channel subtypes are implicated in the pathogenesis of PD and that dihydropyridine CCBs with selectivity for these ion channels have a neuroprotective effect in animal models 39. Although some conflicting studies have shown that antihypertensive drugs are not associated with ND 40, the results of this study suggest that CCBs may help prevent ND.
Aspirin has previously been shown to reduce the incidence of AD and PD, as well as cardiovascular events and cancer. Aspirin-mediated acetylation prevents several neurodegenerative pathologies by interfering with protein aggregation 41. Cilostazol has been shown to have a neuroprotective effect against L-methionine-induced vascular dementia in mice 42. It is unclear whether the protective effects of aspirin and cilostazol against ND are indirect, arising from their established effects on reducing the risk of cardiovascular disease, which is itself a risk factor for ND, or whether they directly affect the pathological mechanisms of ND.
This study has several limitations. First, because of the retrospective nature of the study, obtaining accurate information from a dataset based on hospital medical records was difficult. Second, the performance of this prediction model was not compared with that of other existing prediction models. Third, other rare neurodegenerative diseases, such as multiple sclerosis, Huntington's disease, and amyotrophic lateral sclerosis, were not included in the ND outcome. Finally, this study, on its own, cannot prove a causal relationship between the predictors used in the model and the incidence of ND. Further experimental research is needed to clarify the biological pathways and demonstrate the mechanisms of interaction between variables related to ND and their impact on the development of ND.
In this study, we developed an ML-based prediction model using a representative national cohort. The model accurately predicted the risk of ND in Korean adults with T2DM. We also demonstrated that the performance of several ML models was satisfactory. The AdaBoost model performed the best (AUROC 0.82 in the discovery dataset and AUROC 0.83 in the validation dataset). To our knowledge, this study is the first to apply an ML-based ND prediction system to a national population with diabetes. The implementation of evidence-based individualized preventive interventions may decrease the burden of ND in South Korean patients with diabetes. The prediction model proposed in this study is expected to be a competitive and cost-effective tool for preventing ND in Korean patients with T2DM and to be widely used, especially in primary care settings.