Sarcopenia and MCI are not independent conditions; they share numerous pathogenic mechanisms and causative factors. A systematic review of the prevalence of sarcopenia and MCI concluded that the overall prevalence of MCI in patients with sarcopenia was 20.5%, with high heterogeneity, whereas the overall prevalence of sarcopenia in patients with MCI was 9.1% [28]. This suggests that the prevalence of MCI in patients with sarcopenia is relatively high and that sarcopenia may be a risk factor for MCI. In addition, several longitudinal studies have shown a significant association between sarcopenia and MCI, and the components of sarcopenia, especially grip strength and gait speed, can serve as predictors for the early prediction and diagnosis of MCI. Therefore, we mainly used body measurements related to sarcopenia to predict the risk of MCI. The variables included were indicators associated with the onset of MCI that have been confirmed in previous studies and are easy to measure in daily practice. Because our model is intended for use in clinical consultation, indicators that require complex laboratory tests were not included, even if they are related to the occurrence of MCI.
We focused on predicting MCI in the specific population of patients with sarcopenia not only because of the high prevalence of MCI in this population but also because MCI can lead to dementia, a severe adverse outcome. First, studies have shown that effective intervention can halve the risk of MCI over the next five years in patients with sarcopenia [11], which demonstrates the importance of screening people at high risk for MCI and providing early prevention. Second, 14.4%-55.6% of patients with MCI may regain neurologic integrity [3], suggesting that early diagnosis of MCI is critical. Risk screening, early diagnosis, and early intervention may significantly reduce the incidence of dementia and the associated medical and social burden. Cognitive impairment due to sarcopenia differs from primary cognitive impairment and from cognitive impairment due to other causes in that patients carry a particular physical burden and may not be amenable to interventions such as exercise to prevent or mitigate the progression of the disease. Therefore, we hope that our model can be used to determine the risk of developing MCI accurately and promptly. Further studies can then design personalized intervention programs for patients with sarcopenia who are at high risk of developing MCI and dementia, so as to achieve precise intervention.
Regarding variable inclusion in the model, first, age and gender are considered variables strongly associated with the onset of MCI. The reason that age was not included in the model may be the relatively narrow age range of the sample. In addition, factors such as ASM and SMI were included in the model, and these are calculated from variables such as age and gender. Thus, although age and gender were not included in the DL model directly, their interactions with other variables remain crucial features of the model. Second, studies have shown that grip strength and gait speed are also significant predictors of MCI, and the reason they were not included in the model may be related to the definition of sarcopenia. According to the AWGS criteria, low muscle mass is the primary condition for diagnosing sarcopenia, and the diagnosis is made when either low muscle strength or low physical performance is also present. Hence, indicators such as grip strength and gait speed are more variable among individuals with sarcopenia, and the model may therefore not select them as essential variables. However, under the definition of severe sarcopenia, in which low muscle mass, low grip strength, and low physical performance co-exist, grip strength and gait speed would probably be essential variables in the model. We were unable to construct a model for severe sarcopenia because of our limited sample, and this will be a focus of our future research. Third, diabetes itself can lead to cognitive impairment. In patients with sarcopenia, diabetes may be a risk factor for accelerated MCI or dementia.
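The AWGS diagnostic logic described above can be made concrete with a short sketch. It assumes that each criterion has already been reduced to a yes/no flag; the function name `classify_awgs` and its parameters are hypothetical, and no measurement cut-offs are encoded because none are stated here.

```python
def classify_awgs(low_muscle_mass: bool,
                  low_grip_strength: bool,
                  low_physical_performance: bool) -> str:
    """Classify one person from three pre-computed yes/no criteria.

    Illustrative only: the flags are assumed to come from whatever
    cut-offs the applicable AWGS consensus specifies.
    """
    # Low muscle mass is the primary condition; without it there is no diagnosis.
    if not low_muscle_mass:
        return "no sarcopenia"
    # All three criteria together define severe sarcopenia.
    if low_grip_strength and low_physical_performance:
        return "severe sarcopenia"
    # Low muscle mass plus either one of the other two criteria is sufficient.
    if low_grip_strength or low_physical_performance:
        return "sarcopenia"
    return "no sarcopenia"


if __name__ == "__main__":
    # A person can be sarcopenic with normal grip strength, which is why
    # grip strength varies within a sarcopenia cohort.
    print(classify_awgs(True, False, True))
    print(classify_awgs(True, True, True))
```

Because either grip strength or physical performance alone is enough for the diagnosis, a sarcopenia cohort mixes people with normal and low values of each, which is consistent with the explanation given above for why these indicators were not selected.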
DL is a new research direction in ML that learns the intrinsic patterns and levels of representation of sample data. Neural networks discover distributed feature representations of data by combining low-level features into more abstract high-level representations of attribute categories or features. Their essential feature is that they mimic the way neurons in the brain transmit and process information: appropriate neuron computation nodes and multi-layer computing hierarchies are designed, suitable input and output layers are selected, and the input-to-output functional relationship is established through learning and tuning of the network. We chose DL modeling to take advantage of the ability of neural networks to learn and infer higher-order nonlinear associations between clinical features and patient outcomes in an entirely data-driven manner [29] and to analyze in depth the effects of variables, and of interactions between variables, on outcomes, which ordinary regression cannot achieve. As shown in the results, the feed-forward neural network did show better prediction results than the basic regression. In addition, with the continuous development of DL, its application in medicine is becoming increasingly extensive. In the future, DL methods could also be used to develop personalized interventions and care for patients.
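As a toy illustration of how a hidden layer can represent an interaction between variables, the sketch below runs the forward pass of a very small feed-forward network. The two binary inputs stand in for hypothetical risk indicators, and all weights are hand-picked for illustration; none of this is taken from the model in this study.

```python
import math


def relu(z: float) -> float:
    return max(0.0, z)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hand-picked, hypothetical weights: two inputs -> two hidden nodes -> one output.
HIDDEN_WEIGHTS = [(1.0, 1.0), (1.0, 1.0)]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [4.0, -8.0]
OUTPUT_BIAS = -2.0


def predict(x):
    """Forward pass: inputs -> hidden layer (ReLU) -> output probability."""
    hidden = [
        relu(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    logit = sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden)) + OUTPUT_BIAS
    return sigmoid(logit)


if __name__ == "__main__":
    # The predicted probability is high only when exactly one input is present,
    # a pattern that depends on the combination of inputs, not on either alone.
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(a, b, round(predict((a, b)), 3))
```

A regression model that uses only a weighted sum of the two inputs cannot reproduce this pattern, because the effect of one input reverses depending on the other; the hidden layer supplies the intermediate features that make it possible.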
The relatively large difference in AUC values between the base regression model and the DL model may be explained by the complexity of the neural network. The neural network reduces the error along the gradient by adjusting the strength of the connections between the input nodes and the hidden nodes, the strength of the connections between the hidden nodes and the output nodes, and the threshold values. After repeated training and learning, the weights and threshold values corresponding to the minimum error can be determined. In addition, compared with a single regression, neural networks can capture both the effect of individual variables on the outcome and the effect of interactions between variables on MCI.
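The training process described above can be sketched as plain gradient descent on a network with one hidden layer. This is a minimal illustration on four made-up samples, not the training procedure used in this study; the function names, layer size, learning rate, and number of epochs are all assumptions, and the "threshold values" appear as the bias terms `b1` and `b2`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Return the hidden activations and the output probability for one sample."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    p = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, p


def mean_loss(data, W1, b1, W2, b2):
    """Mean binary cross-entropy (the 'error' being minimized)."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for x, y in data:
        _, p = forward(x, W1, b1, W2, b2)
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def train(data, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in, n = len(data[0][0]), len(data)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden          # hidden-node thresholds
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0                       # output-node threshold
    history = [mean_loss(data, W1, b1, W2, b2)]
    for _ in range(epochs):
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        for x, y in data:
            h, p = forward(x, W1, b1, W2, b2)
            d_out = p - y  # gradient of the loss w.r.t. the output pre-activation
            gb2 += d_out
            for j in range(n_hidden):
                gW2[j] += d_out * h[j]
                d_hid = d_out * W2[j] * h[j] * (1.0 - h[j])  # back-propagated error
                gb1[j] += d_hid
                for i in range(n_in):
                    gW1[j][i] += d_hid * x[i]
        # Step every weight and threshold against its averaged gradient.
        for j in range(n_hidden):
            W2[j] -= lr * gW2[j] / n
            b1[j] -= lr * gb1[j] / n
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i] / n
        b2 -= lr * gb2 / n
        history.append(mean_loss(data, W1, b1, W2, b2))
    return (W1, b1, W2, b2), history


# Made-up toy data in which the label depends on the combination of the two inputs.
DATA = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 0)]

if __name__ == "__main__":
    params, history = train(DATA)
    print("initial loss:", history[0])
    print("final loss:", history[-1])
```

The `history` list records the error after each pass, so the repeated reduction of error described above can be inspected directly; how low it goes on real data depends on the architecture and hyperparameters, which are not specified here.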
This study has several strengths. First, the DL model performed well in both model construction and validation, and our cohort was sampled from across the country, so the model should generalize well within China. Second, we used ML methods to double-validate the included variables and constructed the model by combining DL and ML. Third, in our search of the domestic and international literature, we found no other studies that predicted MCI in patients with sarcopenia. Lastly, we built an online calculator webpage to facilitate the application of the model. The study also has several limitations. First, we lacked validation sets from other countries and therefore could not show that the model applies to other races and to populations abroad. We are actively searching for usable data from different countries and regions so that this validation can be performed as early as possible. Second, because we relied on a public database, the set of variables related to the onset of MCI is not comprehensive; for example, exercise status was not included in the study because of the large amount of missing data. Third, regarding the assessment of variables, muscle mass is generally measured using BIA or DXA, whereas we estimated ASM with a formula, although this formula has shown good agreement with DXA. For walking speed, CHARLS uses measurements over a standard distance of five meters, whereas the international standard is a distance of six meters. However, it has been documented that this difference in distance does not affect walking speed [30]. Finally, since all included participants were over 45 years of age, the DL model is mainly applicable to the elderly population and does not apply to all age groups.