Evaluating feasibility of lower extremity joint moments predicted by an artificial intelligence model during walking in patients with cerebral palsy

doi:10.21203/rs.3.rs-4124385/v1


This work is licensed under a CC BY 4.0 License

Version 1


Several studies have highlighted the advantages of employing artificial intelligence (AI) models in gait analysis. However, the practicality of integrating AI models into clinical gait routines remains uncertain. In this study, we propose a three-step approach to evaluate the feasibility of a previously developed AI model. This model predicts joint moments during walking for 622 patients with cerebral palsy using joint kinematics as the input. Root mean square errors between lab-measured and predicted joint moments are labeled as Green (acceptable), Yellow (acceptable with caution), or Red (unacceptable). Kinematics are classified accordingly, and statistical analyses determine their impact on the AI model output. A linear discriminant analysis (LDA) model predicts labels for newly predicted joint moments based on kinematics. The knee moment has the largest Green label population (73%), while the ankle moment has the smallest (34%). Gait profile scores show significant differences between all labels except Yellow vs. Red for the ankle joint. The LDA model achieves 75% accuracy for knee joint moment prediction, with a Green sensitivity of 94.7%. More severe patient conditions lead to an increase in the Red population. While the AI model shows promise for predicting knee and hip moments, further development is necessary before its integration into clinical routines.

Physical sciences/Engineering/Biomedical engineering

Health sciences/Diseases/Neurological disorders/Movement disorders

Biological sciences/Computational biology and bioinformatics/Data mining

Health sciences/Health care/Quality of life

Artificial Intelligence

Gait

Cerebral Palsy

Feasibility

Joint Moment

Today, the use of artificial intelligence (AI) has grown significantly and rapidly in various branches of science, and motion analysis science has not been left untouched in this regard ^1,2.

AI models have provided the capability to look deeper into the musculoskeletal system and movement disorders, which were not achievable by conventional models ^3–5. Shin et al. discussed the benefits of AI in facilitating and increasing the efficiency of musculoskeletal imaging workflow in clinical applications in a review study ⁶. In another study, Sharma et al. trained different AI models to estimate the kinematics and kinetics features obtained from conventional musculoskeletal models. They used inertial motion capturing data of five typically developed adults as inputs to the AI models and suggested the AI model as a promising approach ⁷.

Regarding motion analysis, AI models can be time- and cost-efficient by avoiding direct measurement of certain parameters. This can be achieved through motion pattern recognition for diagnosis, classification, and prediction ^8,9.

However, training an AI model necessitates a vast dataset, which limits the availability of studies focusing on patients with musculoskeletal disorders. Ozates et al. addressed this gap by training a convolutional neural network model to predict lower extremity joint moments during walking. They utilized retrospective gait data from 132 typically developed individuals and 622 patients with bilateral cerebral palsy (CP), drawing from a sizable database for their study ¹⁰.

Overall, the majority of the AI models in movement analysis still remain within the research domain, and their performance is typically evaluated using standard criteria such as root mean square error (RMSE) and Pearson correlation ¹¹. While these criteria are important for quantitatively evaluating the AI model performance, we believe they do not tell the whole story. There is a need for more detailed evaluation of the AI outputs from a clinical perspective, incorporating patient conditions into the assessment process. Consequently, more detailed assessment will help to get closer to using AI in clinical routines.

In this follow-up investigation, our aim was to comprehensively scrutinize the output of the AI model used in the study by Ozates et al. ¹⁰ concerning patients with CP during walking. To achieve this objective, we employed a three-step assessment approach. Firstly, we defined a clinically meaningful threshold for lower extremity joint moments during walking to assess measurement errors. To the best of our knowledge, such a threshold has not been fully established yet. Secondly, we labeled the model outputs based on this threshold, using a color-coded system (Green as acceptable, Yellow as acceptable with caution, and Red as unacceptable). Subsequently, we traced back the kinematic inputs and classified them according to their corresponding output’s label to evaluate the performance of the AI model in more detail. Lastly, to assess the practicality of the AI model, we attempted to predict the label of newly predicted joint moments based on their kinematics (Fig. 1).

Figure 1 here

Data

This study utilized the same input and output data as Ozates et al. ¹⁰. The dataset comprises anonymized retrospective gait kinematics and kinetics data from 622 CP patients with spastic diplegia, walking at self-selected speeds. All methods were carried out in accordance with relevant guidelines and regulations (Declaration of Helsinki). The study was approved by the local ethical committee of the Heidelberg University Hospital (S-227/2021). Informed consent was obtained from all subjects and/or their legal guardian(s). All patients were able to walk barefoot without any walking assistance. In the referenced study, a convolutional neural network model was trained to predict ankle dorsi-plantar flexion, knee flexion-extension, and hip flexion-extension moments using the equivalent laboratory-measured joint moments as targets and trunk, pelvis, hip, knee, and ankle kinematics as inputs. Only data from the stance phase were utilized throughout the process. For further details, we refer the reader to Ozates et al. ¹⁰.

AI model (the Convolutional Neural Network)

Because the AI model employed in this study is described in detail in Ozates et al. ¹⁰, only a concise overview is provided in this section. The implemented machine learning architecture is a one-dimensional convolutional neural network with five convolutional layers, an increasing number of filters [128, 128, 512, 1024, 2048], and decreasing one-dimensional kernel sizes [30, 15, 10, 5, 3]. This architecture was intended to extract features over varying time intervals. The convolutional layers were followed by ten densely connected layers with a descending number of neurons [10000, 8000, 6000, 4000, 3000, 2000, 1000, 500, 250, 100], as a general approach for transforming the information to the desired output size. The model was trained with 10-fold cross-validation to maximize the success of the joint moment prediction ¹⁰.
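To make the effect of the decreasing kernel sizes concrete, the following minimal sketch walks a time series through the five convolutional layers. Only the filter counts, kernel sizes, and dense layer sizes come from the text. The stride of 1, the absence of padding, the 101 time samples, and the 15 input channels are assumptions made for illustration, and the function names are hypothetical.

```python
# Layer sizes quoted in the text; everything else below is an assumption.
CONV_FILTERS = [128, 128, 512, 1024, 2048]
KERNEL_SIZES = [30, 15, 10, 5, 3]
DENSE_UNITS = [10000, 8000, 6000, 4000, 3000, 2000, 1000, 500, 250, 100]


def conv_output_lengths(n_samples, kernel_sizes):
    """Sequence length after each 1-D convolution (stride 1, no padding assumed)."""
    lengths = []
    length = n_samples
    for kernel in kernel_sizes:
        length = length - kernel + 1
        if length < 1:
            raise ValueError("input is too short for this kernel stack")
        lengths.append(length)
    return lengths


def conv_parameter_counts(n_input_channels, filters, kernel_sizes):
    """Number of weights plus biases in each convolutional layer."""
    counts = []
    in_channels = n_input_channels
    for out_channels, kernel in zip(filters, kernel_sizes):
        counts.append((in_channels * kernel + 1) * out_channels)
        in_channels = out_channels
    return counts


# Hypothetical input: 101 time-normalized samples and 15 kinematic channels.
lengths = conv_output_lengths(101, KERNEL_SIZES)
params = conv_parameter_counts(15, CONV_FILTERS, KERNEL_SIZES)
print(lengths)    # [72, 58, 49, 45, 43]
print(params[0])  # 57728
```

Under these assumptions the first layer sees windows of 30 samples, while later layers combine progressively shorter windows of already aggregated features.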

Joint moment threshold

A 5-degree measurement error is well established in clinical routines as the maximum allowable limit for gait kinematics ¹². However, no comparably clear criterion exists for joint moments. In this regard, Meldrum et al. reported detailed test-retest reliability results for three-dimensional gait analysis in a cohort of 30 typically developed adults, including the standard error of measurement (SEM) and the averaged peak moments (Nm/kg) ¹³. Using their results, we expressed each SEM relative to the corresponding mean peak joint moment. These values ranged from 17% to 25% of the peak moments.

Lobet et al. reported the same parameters as Meldrum et al. for the 3D gait analysis of a cohort of patients with hemophilia experiencing blood-induced joint issues ¹⁴. Applying the same normalization, the SEM varied between 12.1% and 24% of the peak joint moments. In addition to the relationship between peak joint moment and measurement error, we examined clinically relevant changes in the moments.

Foucher (2016) showed that the average minimum clinically important improvement for the peak hip adduction moment after total hip arthroplasty, in a cohort of women with an average age of 61 years, is 0.87% Body Weight × Body Height (approximately equal to 14.75% Nm/kg, assuming an average height of 1.73 m). In another study, Miyazaki et al. ¹⁵ showed that the risk of knee osteoarthritis progression increases 6.46 times with a 1% Body Weight × Body Height increase in the peak knee adduction moment (approximately 17% Nm/kg). Lastly, Schwarze et al. ¹⁶ compared the effect of laterally wedged insoles and ankle-foot orthoses in patients with medial knee osteoarthritis. They concluded that a 5% to 10% reduction in the maximum knee adduction moment during walking, typically achieved with lateral wedge insoles, may already be clinically meaningful.
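The unit conversions quoted above can be reproduced with a short calculation. The sketch below assumes a gravitational acceleration of 9.8 m/s² (not stated in the text, but it reproduces the quoted figures) and reads "14.75% Nm/kg" as 0.1475 Nm/kg; the function name is hypothetical.

```python
G = 9.8  # m/s^2, assumed gravitational acceleration


def bwht_percent_to_nm_per_kg(percent_bw_ht, height_m, g=G):
    """Convert a moment given in % Body Weight x Body Height to Nm/kg.

    Body weight is mass * g, so dividing the moment by body mass leaves
    (percent / 100) * g * height.
    """
    return percent_bw_ht / 100.0 * g * height_m


# 0.87 % BW x Ht (hip adduction moment) at an assumed height of 1.73 m
hip_change = bwht_percent_to_nm_per_kg(0.87, 1.73)
# 1 % BW x Ht (knee adduction moment) at the same assumed height
knee_change = bwht_percent_to_nm_per_kg(1.0, 1.73)

print(round(hip_change, 4))   # 0.1475
print(round(knee_change, 4))  # 0.1695
```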

The above data illustrate a notable range of variation in the normalized peak joint moments, indicating sensitivity to measurement errors. To assess changes in joint moments effectively, we propose two thresholds instead of one. Conservatively, we define the lower and upper kinetic limits (LKL, UKL) to evaluate the measurement error of a lower extremity joint moment while remaining clinically relevant.

$$LKL=10\%{MaxMom}^{*}$$

$$UKL=20\%{MaxMom}^{*}$$

where ${MaxMom}^{*}$ is the standardized peak joint moment:

$${MaxMom}^{*}= \frac{|MaxMom - MeanMom|}{StdMom}$$

Here, $MaxMom$ (Nm/kg) is the maximal value of the joint moment, $MeanMom$ (Nm/kg) is the average of the joint moment, and $StdMom$ (Nm/kg) is the standard deviation of the joint moment, all during the stance phase.

Applying these limits, the joint moments can be classified into three labels using a color-coded system. RMSE values smaller than LKL are considered acceptable (Green), those between LKL and UKL are considered acceptable with caution (Yellow), and values larger than UKL are considered unacceptable (Red).
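The two limits and the color-coded labeling can be sketched as follows. The text does not state which moment curve the limits are computed from or which standard deviation estimator is used; this sketch assumes a single stance-phase moment curve and the sample standard deviation, and it assigns values exactly on a limit to Yellow. The function names and the example curve are hypothetical.

```python
import statistics


def standardized_peak(moment):
    """MaxMom* = |MaxMom - MeanMom| / StdMom over the stance phase."""
    max_mom = max(moment)
    mean_mom = statistics.fmean(moment)
    std_mom = statistics.stdev(moment)  # sample standard deviation (assumed)
    return abs(max_mom - mean_mom) / std_mom


def kinetic_limits(moment):
    """Return (LKL, UKL) as 10% and 20% of the standardized peak moment."""
    peak = standardized_peak(moment)
    return 0.10 * peak, 0.20 * peak


def label_error(rmse, lkl, ukl):
    """Map an RMSE to Green, Yellow, or Red using the two limits."""
    if rmse < lkl:
        return "Green"
    if rmse <= ukl:
        return "Yellow"
    return "Red"


# Hypothetical stance-phase moment curve (Nm/kg), for illustration only.
example_moment = [0.0, 0.5, 1.0, 0.5, 0.0]
lkl, ukl = kinetic_limits(example_moment)
print(round(lkl, 3), round(ukl, 3))
print(label_error(0.12, lkl, ukl), label_error(0.22, lkl, ukl), label_error(0.37, lkl, ukl))
```

For this toy curve the limits come out near 0.14 and 0.29, so the three example errors fall into Green, Yellow, and Red, respectively.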

Kinematic features

Originally, the AI model predicted the joint moments using body kinematics as input in all three planes. However, in this study, we limited our assessment to the sagittal plane during the stance phase.

We calculated the RMSE for each comparison between the measured and predicted moments of hip flexion/extension (hip moment in Nm/kg), knee flexion/extension (knee moment in Nm/kg), and ankle dorsiflexion/plantarflexion (ankle moment in Nm/kg) separately. It should be noted that, because the joint moments had already been normalized to body mass (kg), the RMSEs were not normalized to the peak-to-peak values of the moments.
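As a minimal illustration of this error measure, the sketch below computes the RMSE between a measured and a predicted moment curve sampled at the same stance-phase time points. The function name and the example values are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equally sampled curves (Nm/kg)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("curves must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical curves in which every sample is off by 0.1 Nm/kg.
measured_moment = [0.0, 0.4, 0.8, 0.4]
predicted_moment = [0.1, 0.3, 0.9, 0.3]
error = rmse(measured_moment, predicted_moment)
print(round(error, 3))  # 0.1
```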

The corresponding measured joint angles (in degrees), serving as inputs to the AI model in the sagittal plane, included pelvic tilt (PelvicTilt), hip flexion/extension (HipFlx), knee flexion/extension (KneeFlx), and ankle dorsiflexion/plantarflexion (AnkleDorsi). In addition to the kinematics, the gait profile score (GPS) ¹⁷ was calculated for each subject, with one difference from the original definition: the foot progression angle was omitted from the calculation. The GPS was the only feature whose calculation included angles from all three planes.
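For orientation, the GPS of Baker et al. ¹⁷ is the root mean square average of per-variable gait variable scores, each being the RMS difference between a subject's kinematic curve and the mean curve of a typically developed reference group. The sketch below follows that definition for whichever variables are supplied. The variable names, the two-variable example, and the reference curves are hypothetical, and the exact sampling used in this study is not specified here.

```python
import math


def gait_variable_score(subject_curve, reference_mean_curve):
    """RMS difference (deg) between one kinematic curve and the reference mean."""
    if len(subject_curve) != len(reference_mean_curve):
        raise ValueError("curves must have the same number of samples")
    squared = [(s - r) ** 2 for s, r in zip(subject_curve, reference_mean_curve)]
    return math.sqrt(sum(squared) / len(squared))


def gait_profile_score(subject_curves, reference_mean_curves):
    """RMS average of the gait variable scores over all supplied variables."""
    scores = [
        gait_variable_score(curve, reference_mean_curves[name])
        for name, curve in subject_curves.items()
    ]
    return math.sqrt(sum(score ** 2 for score in scores) / len(scores))


# Hypothetical curves: constant offsets of 3 deg (knee) and 4 deg (hip).
reference = {"KneeFlx": [10.0, 20.0, 30.0], "HipFlx": [20.0, 10.0, 0.0]}
subject = {"KneeFlx": [13.0, 23.0, 33.0], "HipFlx": [24.0, 14.0, 4.0]}
gps = gait_profile_score(subject, reference)
print(round(gps, 2))  # 3.54
```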

After the joint moments were labeled by LKL and UKL based on their RMSEs (Fig. 2), the kinematics were grouped according to their corresponding moment labels. Next, the kinematic features maximum (MAX), minimum (MIN), mean (MEAN), and range of motion (ROM) were extracted for each label, and GPS values were calculated as well. A one-way ANOVA with Bonferroni post-hoc tests was performed on the kinematic features and GPS between all labels to identify which features significantly affected the output of the AI model (significance level: Pvalue = 0.05).
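The feature extraction and grouping step can be sketched as below. The code only prepares the per-label samples; the one-way ANOVA and Bonferroni post-hoc tests would then be run on these groups with a statistics package. The function names and the example curves are hypothetical.

```python
import statistics


def curve_features(angle_curve):
    """MAX, MIN, MEAN, and ROM (all in degrees) of one stance-phase angle curve."""
    highest = max(angle_curve)
    lowest = min(angle_curve)
    return {
        "MAX": highest,
        "MIN": lowest,
        "MEAN": statistics.fmean(angle_curve),
        "ROM": highest - lowest,
    }


def group_feature_by_label(curves, labels, feature):
    """Collect one feature per subject, grouped by the joint moment label."""
    groups = {"Green": [], "Yellow": [], "Red": []}
    for curve, label in zip(curves, labels):
        groups[label].append(curve_features(curve)[feature])
    return groups


# Hypothetical knee flexion curves (deg) for three subjects and their labels.
knee_curves = [[10, 20, 30], [0, 40, 20], [5, 5, 50]]
moment_labels = ["Green", "Red", "Green"]
rom_groups = group_feature_by_label(knee_curves, moment_labels, "ROM")
print(rom_groups)  # {'Green': [20, 45], 'Yellow': [], 'Red': [40]}
```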

To examine whether the label of a newly predicted joint moment can be determined, the entire cohort was divided into two groups: a training group of 550 subjects (approximately 90% of the population) and a test group of 72 subjects (approximately 10% of the population).
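A split of this size can be sketched as follows. How the subjects were assigned to the two groups is not stated, so the random shuffle, the seed, and the integer subject identifiers are assumptions for illustration.

```python
import random


def split_subjects(subject_ids, n_test, seed=0):
    """Shuffle the subjects reproducibly and hold out n_test of them."""
    shuffled = list(subject_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# 622 subjects in total, 72 of them held out for testing.
train_ids, test_ids = split_subjects(range(622), n_test=72)
print(len(train_ids), len(test_ids))  # 550 72
```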

After the statistical analyses were performed on the training group (Table 1, Table 2, Fig. 2, Fig. 3), a linear discriminant analysis (LDA) model ¹⁸ was established using the kinematic features of the training group as predictors and their moment labels (Green, Yellow, Red) as output. Only kinematic features with significant differences between at least two labels (Table 2) were used to train the LDA model. The LDA model was then used to predict the label of a newly predicted joint moment, and the test group was used to evaluate the LDA model.
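The idea behind such a label predictor can be illustrated with a small, self-contained LDA. This is a generic textbook formulation (class means, one pooled covariance matrix, priors taken from class frequencies), not the MATLAB implementation used in the study, whose settings are not given here. The two features and all numbers in the toy data are hypothetical.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_lda(features, labels):
    """Return {label: (weights, bias)} of the linear discriminant functions."""
    classes = sorted(set(labels))
    dim, n = len(features[0]), len(features)
    means = {}
    for c in classes:
        rows = [x for x, y in zip(features, labels) if y == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    # Pooled within-class covariance, shared by all labels.
    pooled = [[0.0] * dim for _ in range(dim)]
    for x, y in zip(features, labels):
        dev = [xi - mi for xi, mi in zip(x, means[y])]
        for i in range(dim):
            for j in range(dim):
                pooled[i][j] += dev[i] * dev[j] / (n - len(classes))
    model = {}
    for c in classes:
        weights = solve(pooled, means[c])
        prior = labels.count(c) / n  # prior from class frequency (assumed)
        bias = -0.5 * sum(w * m for w, m in zip(weights, means[c])) + math.log(prior)
        model[c] = (weights, bias)
    return model


def predict_lda(model, x):
    """Pick the label whose discriminant score is highest."""
    scores = {
        c: sum(w * xi for w, xi in zip(weights, x)) + bias
        for c, (weights, bias) in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical training data: [KneeFlx MAX (deg), PelvicTilt ROM (deg)].
train_x = [[34, 6], [36, 7], [33, 5], [35, 6.5],
           [38, 8], [40, 7.5], [37, 8.5], [39, 9],
           [50, 9], [52, 10], [49, 9.5], [53, 11]]
train_y = ["Green"] * 4 + ["Yellow"] * 4 + ["Red"] * 4
lda_model = fit_lda(train_x, train_y)
print(predict_lda(lda_model, [34.5, 6.0]), predict_lda(lda_model, [51.0, 10.0]))
```

In practice a library implementation would normally be preferred; the explicit version is shown only to make the decision rule visible.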

Table 1

Population of each label classified using the LKL and UKL for the hip, knee, and ankle moments (Nm/kg) in number and percentage for the training group. The mean ± std (standard deviation) RMSE of each label, along with their corresponding P-values between labels, are also presented. The significant values are bolded (Pvalue = 0.05).
	Hip Moment (Nm/kg)			Knee Moment (Nm/kg)			Ankle Moment (Nm/kg)
	Population	%	RMSE	Population	%	RMSE	Population	%	RMSE
Green	242	44	0.15 ± 0.05	403	73	0.12 ± 0.04	188	34	0.10 ± 0.03
Yellow	269	49	0.22 ± 0.08	127	23	0.22 ± 0.08	288	52	0.18 ± 0.05
Red	39	7	0.37 ± 0.18	20	4	0.41 ± 0.07	74	14	0.31 ± 0.1
Pvalue (Green vs Yellow)			< 0.001			< 0.001			< 0.001
Pvalue (Green vs Red)			< 0.001			< 0.001			< 0.001
Pvalue (Yellow vs Red)			< 0.001			< 0.001			< 0.001

Table 2

Averaged (mean ± std) kinematics features (MAX, MIN, ROM, MEAN, and GPS) during the gait stance corresponding to each label for each joint moment. The Pvalues between labels are presented for each joint, and the features with significant differences are presented in bold (Pvalue = 0.05).
	MAX (Deg)				MIN (Deg)				ROM (Deg)				MEAN (Deg)				GPS
Hip Moment (Nm/kg)	PelvicTilt	HipFlx	KneeFlx	AnkleDorsi	PelvicTilt	HipFlx	KneeFlx	AnkleDorsi	PelvicTilt	HipFlx	KneeFlx	AnkleDorsi	PelvicTilt	HipFlx	KneeFlx	AnkleDorsi
Green	18.4 ± 5.9	39.3 ± 7.6	34.1 ± 8.8	12.4 ± 8.4	12.3 ± 5.6	-0.6 ± 8.9	10.5 ± 11.1	-9.32 ± 10.3	6.1 ± 2.8	39.8 ± 8.5	23.5 ± 7.4	21.7 ± 6.9	15.7 ± 5.6	17.1 ± 7.7	19.2 ± 9.8	5.7 ± 7.8	12.7 ± 3.3
Yellow	19.1 ± 7.0	41.0 ± 9.1	37.5 ± 11.1	12.6 ± 7.7	11.7 ± 7.2	1.9 ± 10.1	13.6 ± 14.0	-9.15 ± 10.1	7.4 ± 3.3	39.1 ± 9.5	23.9 ± 8.1	21.7 ± 6.6	15.8 ± 6.9	18.8 ± 8.7	22.5 ± 12.9	5.7 ± 7.6	14.1 ± 3.8
Red	20 ± 7.5	44.6 ± 9.7	42.1 ± 16.1	11.3 ± 13.1	10.5 ± 6.5	5.8 ± 12.0	19.6 ± 19.1	-11.6 ± 15.0	9.5 ± 4.1	38.8 ± 12.0	22.5 ± 11.0	22.8 ± 7.8	15.7 ± 6.9	23.4 ± 10.0	28.7 ± 17.1	3.6 ± 12.1	16.4 ± 4.9
Pvalue (Green vs Yellow)	0.70	0.60	< 0.001	1.00	0.80	0.01	0.03	1.00	< 0.001	0.90	1.00	1.00	1.00	0.04	0.006	1.00	< 0.001
Pvalue (Green vs Red)	0.50	< 0.001	< 0.001	1.00	0.30	< 0.001	< 0.001	0.70	< 0.001	1.00	1.00	1.00	1.00	< 0.001	< 0.001	0.50	< 0.001
Pvalue (Yellow vs Red)	1.00	0.04	0.008	1.00	0.90	0.07	0.02	0.60	< 0.001	1.00	0.90	1.00	1.00	0.004	0.008	0.40	< 0.001
Knee Moment (Nm/kg)
Green	18.7 ± 6.2	40.1 ± 7.9	34.8 ± 8.9	12.7 ± 7.5	12.2 ± 6.2	-0.3 ± 8.8	10.9 ± 11.0	-8.9 ± 9.6	6.51 ± 3.1	40.3 ± 9.0	23.8 ± 7.6	21.6 ± 6.5	15.8 ± 5.9	17.4 ± 7.5	19.8 ± 9.7	5.9 ± 6.9	12.9 ± 3.1
Yellow	19.9 ± 7.5	41.7 ± 9.9	38.6 ± 12.4	10.9 ± 10.3	11.8 ± 7.1	4.1 ± 11	14.9 ± 15.8	-11.2 ± 12.6	8.17 ± 3.3	37.6 ± 9.3	23.6 ± 9.2	22.1 ± 7.6	16.3 ± 7.1	20.1 ± 10.1	23.8 ± 14.8	3.9 ± 10.1	15.1 ± 4.2
Red	15.8 ± 7.4	43.5 ± 12.1	51.9 ± 18.9	14.5 ± 12.0	6.6 ± 7.2	10.9 ± 11.1	33.2 ± 19.9	-7.98 ± 14.6	9.18 ± 3.9	32.5 ± 10.1	18.7 ± 7.2	22.4 ± 6.7	11.6 ± 7.1	25.9 ± 11.4	41.4 ± 20.1	7.3 ± 12.0	19.9 ± 5.2
Pvalue (Green vs Yellow)	0.20	0.15	< 0.001	0.10	1.00	< 0.001	0.006	0.10	< 0.001	0.01	1.00	1.00	1.00	0.01	0.002	0.04	< 0.001
Pvalue (Green vs Red)	0.17	0.22	< 0.001	1.00	< 0.001	< 0.001	< 0.001	1.00	< 0.001	< 0.001	0.01	1.00	0.01	< 0.001	< 0.001	1.00	< 0.001
Pvalue (Yellow vs Red)	0.02	1.00	< 0.001	0.20	0.00	0.008	< 0.001	0.60	0.60	0.06	0.03	1.00	0.01	0.01	< 0.001	0.20	< 0.001
Ankle Moment (Nm/kg)
Green	18.9 ± 5.9	40.4 ± 8.1	34.9 ± 8.5	13.6 ± 7.6	12.5 ± 5.8	-0.6 ± 8.9	10.3 ± 10.7	-7.42 ± 9.3	6.4 ± 3.2	41.0 ± 8.5	24.6 ± 7.2	21.1 ± 6.3	16.0 ± 5.6	17.4 ± 7.9	19.5 ± 9.3	6.4 ± 7.1	12.7 ± 3.0
Yellow	19 ± 6.7	40.9 ± 8.6	36.8 ± 11.0	11.8 ± 8.7	11.8 ± 6.5	1.8 ± 9.8	13.6 ± 13.4	-10.5 ± 11.4	7.2 ± 3.3	39.1 ± 9.3	23.2 ± 8.1	22.3 ± 7.3	15.8 ± 6.4	19.1 ± 8.6	22.4 ± 12.3	4.9 ± 8.5	14.0 ± 4.0
Red	18.2 ± 7.5	39.4 ± 9.8	37.7 ± 14.4	11.4 ± 8.9	10.7 ± 7.9	2.9 ± 11.1	14.9 ± 18.3	-10.3 ± 10.1	7.5 ± 3.5	36.5 ± 10.3	22.8 ± 9.0	21.7 ± 5.9	14.8 ± 7.6	18.4 ± 9.7	23.0 ± 17.2	5.1 ± 8.5	14.3 ± 4.4
Pvalue (Green vs Yellow)	1.00	1.00	0.18	0.07	0.70	0.01	0.02	0.006	0.01	0.06	0.20	0.13	1.00	0.12	0.03	0.12	< 0.001
Pvalue (Green vs Red)	1.00	1.00	0.18	0.18	0.12	0.02	0.03	0.10	0.04	0.001	0.30	1.00	0.40	1.00	0.10	0.65	0.006
Pvalue (Yellow vs Red)	0.90	0.60	1.00	1.00	0.60	1.00	1.00	1.00	1.00	0.10	1.00	1.00	0.70	1.00	1.00	1.00	1.00

Two parameters, model accuracy and label sensitivity, were used to assess the LDA model when applied to the test group. Model accuracy is defined as the ratio of all correctly predicted labels to the total number of predictions (the size of the test group). Label sensitivity is the probability of correctly predicting each label within its own cohort, i.e., the ratio of the number of correctly predicted labels to the population of that label in the test group as determined by LKL and UKL (Fig. 1). All calculations were done in MATLAB (© 1994–2024 The MathWorks, Inc., MA, USA).
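Both measures follow directly from the label counts. The sketch below applies the two definitions to the hip moment counts reported for the test group in Table 3; the function names are hypothetical.

```python
def label_sensitivity(matched, classified):
    """Share (%) of one label's true members that were predicted correctly."""
    return 100.0 * matched / classified


def model_accuracy(matched_per_label, n_test):
    """Share (%) of all test subjects whose label was predicted correctly."""
    return 100.0 * sum(matched_per_label.values()) / n_test


# Hip moment counts for the test group (72 subjects), taken from Table 3.
hip_classified = {"Green": 22, "Yellow": 46, "Red": 4}
hip_matched = {"Green": 12, "Yellow": 31, "Red": 1}

accuracy = model_accuracy(hip_matched, n_test=72)
sensitivity = {
    label: label_sensitivity(hip_matched[label], hip_classified[label])
    for label in hip_classified
}
print(round(accuracy, 1))              # 61.1
print(round(sensitivity["Green"], 1))  # 54.5
```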

All the results presented in this section are in the sagittal plane (except for the GPS) during the stance phase for the training (Table 1, Table 2, and Fig. 2, Fig. 3) and test (Table 3) groups.

Table 3

Population of each label for the test group (72 subjects): classified using LKL and UKL, predicted using the LDA model, and matched between classified and predicted labels. Label sensitivity (%) and model accuracy (%) of the LDA model are also presented. The last row presents the same results for the sum of the Green and Yellow labels.
	Hip Moment (Nm/kg)					Knee Moment (Nm/kg)					Ankle Moment (Nm/kg)
	Classified	Predicted	Matched	Label Sensitivity	Model Accuracy	Classified	Predicted	Matched	Label Sensitivity	Model Accuracy	Classified	Predicted	Matched	Label Sensitivity	Model Accuracy
Green	22	27	12	54.50	61.10	57	66	54	94.70	75.00	35	10	7	20.00	46.00
Yellow	46	44	31	67.40		11	5	0	0.00		28	62	26	92.80
Red	4	1	1	25.00		4	1	0	0.00		9	0	0	0.00
Green + Yellow	68	71	68	100.00	95.80	68	71	67	98.50	93.00	63	72	63	100.00	87.50

Table 1 presents the size of each label (Green, Yellow, and Red) classified using LKL and UKL for the hip, knee, and ankle moments (Nm/kg). The averaged RMSE for each label is presented for the training group (550 subjects) as well.

The largest and smallest Green populations belong to the knee moment (403, 73%) and the ankle moment (188, 34%), respectively. Conversely, the ankle moment has the largest Yellow (288, 52%) and Red (74, 14%) populations, while the knee moment has the smallest (127, 23% and 20, 4%, respectively). RMSE values show significant differences (Pvalue < 0.001) between all labels for all joints.

Table 2 presents the kinematic features (MAX, MIN, ROM, MEAN, and GPS) for the PelvicTilt, HipFlx, KneeFlx, and AnkleDorsi angles (deg) for all labels, together with the corresponding Pvalues. For the hip moment, KneeFlx MAX, KneeFlx MIN, PelvicTilt ROM, HipFlx MEAN, and KneeFlx MEAN differ significantly between all labels. Similarly, for the knee moment, these features are KneeFlx MAX, HipFlx MIN, KneeFlx MIN, HipFlx MEAN, and KneeFlx MEAN. For the ankle moment, no feature differs significantly between all labels; only HipFlx MIN, KneeFlx MIN, and PelvicTilt ROM differ significantly for both Green vs. Yellow and Green vs. Red. GPS values differ significantly between all labels for all joints, except Yellow vs. Red for the ankle moment.

Table 3 presents the results of the test group. The highest number of Green matches and the highest Green label sensitivity between classified (using LKL and UKL) and predicted (using the LDA model) labels belong to the knee moment (54 and 94.7%, respectively), while the lowest belong to the ankle moment (7 and 20%). Conversely, for the Yellow label, the ankle moment has the highest values (26 and 92.8%), and the knee moment has the lowest (0 and 0%). The highest and lowest LDA model accuracies belong to the knee moment (75%) and the ankle moment (46%), respectively. The model accuracy for the Green + Yellow condition ranges from 87.5% for the ankle moment to 95.8% for the hip moment.

Figure 2 shows the RMSE distribution among the Green, Yellow, and Red labels for the hip, knee, and ankle joint moments in the training group.

Figure 3 shows the hip, knee, and ankle moments (Nm/kg) in the sagittal plane during the stance phase (%), labeled using LKL and UKL, for the training group. The averaged RMSE for each label is reported in each plot title.

In this study, we proposed a three-step approach to assess the outcome of an AI model that was established to predict lower extremity joint moments during walking in patients with CP ¹⁰, with the aim of evaluating the model's feasibility in clinical routine. In the first step, we calculated the error (RMSE) between the measured and corresponding predicted moments. Next, we proposed new thresholds to classify the moments into three labels (Green: acceptable, Yellow: acceptable with caution, and Red: unacceptable) based on the computed error. Additionally, we grouped the AI model input (joint kinematics during walking in the sagittal plane) according to the corresponding joint moment label to investigate how changes in kinematics affect the predicted moments. In the last step, we established an LDA model to predict the label of a joint moment newly predicted by the AI model.

With significant differences in RMSEs (Pvalues < 0.001) between all labels (Table 1), our proposed thresholds (LKL and UKL) successfully classified the joint moments in the presence of deformities (Fig. 2). The advantage of this labeling is that LKL and UKL consider not only the measurement error but also clinical relevance. To the best of our knowledge, this is the first time such thresholds have been introduced for evaluating the validity of gait kinetics measures.

Technically, if the relative error between the predicted and measured moment (e.g., RMSE) is less than a certain amount (LKL), we expect that the relative error does not (or only slightly) influence our interpretation of the predicted moment. Assessed individually, the AI model demonstrated the best performance for the knee joint moment, with 73% labeled as Green (acceptable) and only 4% as Red (unacceptable). Conversely, for the ankle joint moment, with 14% labeled as Red and only 34% as Green, the performance could be considered poor. For the hip joint moment, the population was more evenly distributed between Green and Yellow (44% and 49%, respectively), with only 7% as Red, suggesting moderate performance. Considering the average results for the entire training group population (50.6% Green, 41.3% Yellow, and 8.3% Red), the overall performance of the AI model was rated as moderate.

The results for the kinematic features presented in Table 2 were aligned with the joint moment label results. For the hip moment, the kinematic differences between labels could be distinguished by six features (Pvalues < 0.05), with pelvic tilt ROM as the most prominent feature, having the lowest Pvalues between labels (< 0.001). This finding is in line with the work of Wolf et al. ¹⁹, who identified pelvic tilt ROM during stance as the most relevant of more than 3000 features for characterizing CP gait. It is noteworthy that pelvic tilt ROM increases with the severity of the patients' condition (from 6.1 degrees for Green to 9.5 degrees for Red). The most prominent feature for the knee moment was MAX knee flexion (Green to Red: 34.8 to 51.9 degrees). This feature, along with MEAN knee flexion (Green to Red: 19.8 to 41.4 degrees), indicated a shift from mild to severe crouch gait in patients with CP ²⁰. For the ankle moment, no feature exhibited a significant difference for Yellow vs. Red.

Another prominent feature (Pvalue < 0.001) was the GPS value. The averaged GPS values increased significantly from Green to Red (from 12.7 to 16.4 degrees, 12.9 to 19.9 degrees, and 12.7 to 14.3 degrees for the hip, knee, and ankle moments, respectively), except for the ankle moment in the Yellow vs. Red comparison. The GPS values indicated that the kinematics tended to deviate further from the typically developed reference from Green to Red for all joint moments. Overall, comparing the kinematic results with the label populations, it can be concluded that with more severe deformities and a more strongly varying gait pattern, the probability of a predicted joint moment being labeled as Red increased.

A linear discriminant analysis model was established to investigate whether the AI model could be applied in clinical routine. Since the model accuracy, as the ratio of all correctly predicted labels to the test population, is very general, the label sensitivity was also presented in Table 3 as the probability of correctly predicting a label within each true label population. In line with the previously discussed results, the LDA model for the knee joint moment had the highest accuracy (75%) with a Green sensitivity of 94.7%, while the performance for the ankle joint moment was poor. Additionally, both the model accuracy and the label sensitivity for all joint moments increased markedly when the Green and Yellow labels were combined. Overall, the Green and Red labels can be accepted as they are for all joint moments. For the knee joint moment, Yellow can be accepted with good confidence, considering that only a few cases may belong to Red, while for the ankle joint moment, the Yellow label should be considered with high caution, as it indicates a strong tendency toward Red.

There are limitations to this study. The AI model used kinetics and kinematics in all three planes, while for this study, only the sagittal plane moments and angles were considered. This limitation may affect the accuracy of the LDA model, indicating a scope for improvement in the label predictor model. Future work should focus on enhancing the AI model to achieve better predictions of joint moments in the presence of severe deformities. Additionally, developing the LDA model to incorporate kinetics and kinematics of all planes would be beneficial.

Overall, the three-step assessment through labeling of the joint moments appears to be a helpful approach. The three-color coded system is simple yet practical for classifying the data. Using the AI model to predict joint moments in the sagittal plane can be recommended for patients with CP of mild or moderate severity, particularly for the knee and hip joint moments. However, the general performance is still rated as moderate. Although AI models can be time- and cost-effective and can facilitate clinical applications, they still require further development before they can be considered an adequate substitute for daily clinical gait measurement routines.

AUTHOR CONTRIBUTIONS

F.S. wrote the main manuscript, planned the methods, and prepared and analyzed the results. Y.Z.A. and M.E.O. established the AI model and provided the AI model outcomes. S.I.W. planned the methods, supervised the study, and analyzed the results. All authors reviewed and approved the final version of the manuscript. F.S. is the corresponding author.

Competing interests

The author(s) declare no competing interests.

DATA AVAILABILITY

Due to patient data confidentiality, authors are not allowed to share the data publicly. Also, data are not available upon simple request to the corresponding author, but only via a regular study.

1. Khera, P. & Kumar, N. Role of machine learning in gait analysis: a review. Journal of Medical Engineering & Technology 44, 441–467 (2020).
2. Zsarnoczky-Dulhazi, F., Agod, S., Szarka, S., Tuza, K. & Kopper, B. AI based motion analysis software for sport and physical therapy assessment. Revista Brasileira de Medicina do Esporte 30, e2022_0020 (2023).
3. Debs, P. & Fayad, L. M. The promise and limitations of artificial intelligence in musculoskeletal imaging. Frontiers in Radiology 3 (2023).
4. Galbusera, F., Casaroli, G. & Bassani, T. Artificial intelligence and machine learning in spine research. JOR Spine 2, e1044 (2019).
5. Takeda, I., Yamada, A. & Onodera, H. Artificial Intelligence-Assisted motion capture for medical applications: a comparative study between markerless and passive marker motion capture. Computer Methods in Biomechanics and Biomedical Engineering 24, 864–873 (2021).
6. Shin, Y., Kim, S. & Lee, Y. H. AI musculoskeletal clinical applications: how can AI increase my day-to-day efficiency? Skeletal Radiology, 1–12 (2022).
7. Sharma, R., Dasgupta, A., Cheng, R., Mishra, C. & Nagaraja, V. H. Machine learning for musculoskeletal modeling of upper extremity. IEEE Sensors Journal 22, 18684–18697 (2022).
8. Kolaghassi, R., Al-Hares, M. K. & Sirlantzis, K. Systematic review of intelligent algorithms in gait analysis and prediction for lower limb robotic systems. IEEE Access 9, 113788–113812 (2021).
9. Molaviaan, R., Fatahi, A., Abbasi, H. & Khezri, D. Artificial Intelligence Approach in Biomechanical Analysis of Gait. Journal of Advanced Sport Technology 7, 23–37 (2023).
10. Ozates, M. E., Karabulut, D., Salami, F., Wolf, S. I. & Arslan, Y. Z. Machine learning-based prediction of joint moments based on kinematics in patients with cerebral palsy. Journal of Biomechanics 155, 111668 (2023).
11. Ardestani, M. M. et al. Human lower extremity joint moment prediction: A wavelet neural network approach. Expert Systems with Applications 41, 4422–4433 (2014).
12. Wilken, J. M., Rodriguez, K. M., Brawner, M. & Darter, B. J. Reliability and minimal detectible change values for gait kinematics and kinetics in healthy adults. Gait & Posture 35, 301–307 (2012).
13. Meldrum, D., Shouldice, C., Conroy, R., Jones, K. & Forward, M. Test–retest reliability of three dimensional gait analysis: Including a novel approach to visualising agreement of gait cycle waveforms with Bland and Altman plots. Gait & Posture 39, 265–271 (2014).
14. Lobet, S., Detrembleur, C., Francq, B. & Hermans, C. Natural progression of blood-induced joint damage in patients with haemophilia: clinical relevance and reproducibility of three‐dimensional gait analysis. Haemophilia 16, 813–821 (2010).
15. Miyazaki, T. et al. Dynamic load at baseline can predict radiographic disease progression in medial compartment knee osteoarthritis. Annals of the Rheumatic Diseases 61, 617–622 (2002).
16. Schwarze, M. et al. A comparison between laterally wedged insoles and ankle-foot orthoses for the treatment of medial osteoarthritis of the knee: A randomized cross-over trial. Clinical Rehabilitation 35, 1032–1043 (2021).
17. Baker, R. et al. The gait profile score and movement analysis profile. Gait & Posture 30, 265–269 (2009).
18. Fisher, R. A. The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179–188 (1936).
19. Wolf, S. et al. Automated feature assessment in instrumented gait analysis. Gait & Posture 23, 331–338 (2006).
20. Lin, C.-J., Guo, L.-Y., Su, F.-C., Chou, Y.-L. & Cherng, R.-J. Common abnormal kinetic patterns of the knee in gait in spastic diplegia of cerebral palsy. Gait & Posture 11, 224–232 (2000).
