Participants
For the Mobilise-D technical validation study (TVS), 108 participants were recruited from five clinical cohorts (CHF, COPD, MS, PD, and PFF) alongside HA. Participants were recruited at five sites: The Newcastle upon Tyne Hospitals NHS Foundation Trust, UK (Sponsor of the study) and Sheffield Teaching Hospitals NHS Foundation Trust, UK (ethics approval granted by the London – Bloomsbury Research Ethics Committee, 19/LO/1507); Tel Aviv Sourasky Medical Center, Israel (ethics approval granted by the Helsinki Committee, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel, 0551-19TLV); Robert Bosch Foundation for Medical Research, Germany (ethics approval granted by the ethical committee of the medical faculty of the University of Tübingen, 647/2019BO2); and University of Kiel, Germany (ethics approval granted by the ethical committee of the medical faculty of Kiel University, D438/18). All participants provided informed consent to take part in the study, and all research was performed in accordance with the Declaration of Helsinki. Inclusion and exclusion criteria are fully described in 24.
Protocol
The protocol has been extensively detailed in 24. Participants were assessed in the laboratory and during a 2.5-hour real-world observation. Mobility data were collected with a wearable device (McRoberts Dynaport MM+, sampling frequency: 100 Hz; triaxial acceleration range: ±8 g, resolution: 1 mg; triaxial gyroscope range: ±2000 degrees per second (dps), resolution: 70 mdps), secured at the waist with a Velcro belt. Participants were also asked to wear the multisensor INDIP reference system (sampling frequency: 100 Hz) 24,30. Specifically, two magneto-IMUs were positioned over the instep and fixed to the shoelaces with clips, and a third IMU was attached to the lower back with Velcro. Distance sensors were then positioned asymmetrically with Velcro (one above the left ankle and another 3 cm higher on the right leg). Pressure insoles were selected for each participant’s foot size and inserted into the shoes. The INDIP system has been validated in previous studies across a range of conditions, as well as in the TVS cohorts, showing excellent results and reliability in the quantification of mobility outcomes (MAE: laboratory ≤ 0.02 m/s, simulated daily activities 0.03 to 0.05 m/s); a complete overview of the validation results can be found in 31. The INDIP and the wearable device were synchronised using their timestamps (± 10 ms). In both protocols, participants only performed tasks that they felt comfortable and safe to do.
Laboratory protocol
Participants were asked to complete seven motor tasks with increasing complexity: Straight walking (slow, normal and fast speed), Timed Up and Go, L-Test, Surface Test, Hallway Test and Simulated Daily Activities. Each task was designed to capture and assess various elements associated with real-world walking including a range of walking speeds, incline/steps, surface, path shape, turns and specific motor tasks to simulate typical real-world transitions 24,37.
Real-world protocol
Participants were assessed for up to 2.5 hours in the real world, as they went about their normal activities unsupervised (home/work/community/outdoor). To ensure that a variety of walking activities was captured, participants were asked to perform specific tasks, such as outdoor walking, walking up and down a slope and stairs, and moving from one room to another, only if they felt comfortable and safe to do so 24.
Calculation of Walking Speed
The evaluation of walking speed requires the combination of several algorithmic steps, including the identification of gait sequences and initial contacts (ICs), and the estimation of the Digital Mobility Outcomes (DMOs) cadence and stride length. The evaluation and selection of state-of-the-art algorithms to detect gait sequences and to estimate IC events, cadence and stride length within the identified gait sequences were reported in our previous work 26 (Figure 1). The best performing algorithms were then used to estimate walking speed by combining the outputs of the stride length and cadence algorithms using eq. (1):
walking speed = (cadence × stride length) / 120 (1)

where cadence is expressed in steps/min, stride length in m, and walking speed in m/s.
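For illustration, eq. (1) can be applied directly to the per-second algorithm outputs, as in the following minimal Python sketch (the array names are illustrative and not taken from the published pipeline):

```python
import numpy as np

# Hypothetical per-second outputs of the cadence and stride length algorithms
# for one detected gait sequence.
cadence_spm = np.array([100.0, 104.0, 102.0, 98.0])   # cadence in steps/min, one value per second
stride_length_m = np.array([1.10, 1.15, 1.12, 1.05])  # stride length in m, one value per second

# Eq. (1): cadence [steps/min] / 60 -> steps/s, / 2 -> strides/s,
# multiplied by stride length [m/stride] -> walking speed [m/s].
walking_speed_mps = cadence_spm * stride_length_m / 120.0
print(walking_speed_mps)  # ~[0.92 1.00 0.95 0.86]
```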
< Insert Figure 7 >
Two independent analytical pipelines (P1 and P2) were identified in this process, due to differences in the algorithms selected for gait-sequence detection and cadence estimation across the conditions included in the study 26. P1 provides the optimal combination of algorithms selected for the HA, COPD, and CHF conditions, and P2 provides the optimal combination for PD, MS, and PFF (Figure 7).
Two additional algorithms were added to both gait analysis pipelines: a turn detection algorithm 47 and a customised algorithm to detect the laterality (left or right step) of each IC 48. Laterality was used to interpolate the cadence, stride length, and walking speed parameters (provided as per-second values by the algorithms) to stride-level values (stride interpolation).
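The stride interpolation step can be sketched as follows. This is a simplified illustration under our own assumptions (function and variable names are not taken from the published pipeline): strides are formed from consecutive ICs of the same foot, and each per-second DMO value is assumed to cover a 1-s window starting at time 0 of the gait sequence.

```python
import numpy as np

def strides_from_ics(ic_times_s, laterality):
    """Pair consecutive initial contacts of the same foot into stride intervals."""
    last_ic = {}   # most recent IC time for each foot ('L'/'R')
    strides = []
    for t, side in zip(ic_times_s, laterality):
        if side in last_ic:
            strides.append((last_ic[side], t))  # one stride: same-foot IC to next same-foot IC
        last_ic[side] = t
    return sorted(strides)

def interpolate_to_strides(strides, per_second_values):
    """Average per-second DMO values over each stride interval (stride interpolation)."""
    per_second_values = np.asarray(per_second_values, float)
    window_starts = np.arange(len(per_second_values))    # each value covers [i, i + 1) s
    out = []
    for start, end in strides:
        overlaps = (window_starts + 1 > start) & (window_starts < end)
        out.append(float(per_second_values[overlaps].mean()) if overlaps.any() else np.nan)
    return np.array(out)

# Hypothetical example: alternating left/right ICs and per-second walking speed values.
ics = [0.0, 0.55, 1.10, 1.66, 2.22, 2.80]
sides = ["L", "R", "L", "R", "L", "R"]
speed_per_second = [0.95, 1.00, 0.98]
stride_speed = interpolate_to_strides(strides_from_ics(ics, sides), speed_per_second)
```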
DMOs were evaluated at the stride level, conforming to consensus-agreed definitions 49 for WBs. Accordingly, a WB was defined as a continuous sequence containing at least two consecutive strides of both feet (e.g., R-L-R-L-R-L or L-R-L-R-L-R, where R/L denote the right/left foot contact with the ground); separate consecutive WBs were defined if a break greater than 3 s was identified between them; and, for a stride to be included in a given WB, it had to have a duration between 0.2 s and 3.0 s and a stride length > 0.15 m 50. WBs compliant with this definition were generated by first filtering the list of identified strides based on the stride-level definition (stride selection) and then grouping them into final WBs based on breaks within the stride sequence (walking bout assembly). The same definitions were also used to define the WBs for the reference system. For both systems, the final DMOs were calculated as the average value over all strides within a WB.
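A minimal sketch of the stride selection and walking bout assembly steps, assuming each identified stride is represented by its start/end times (s) and stride length (m); the data representation and the four-stride minimum (two strides per foot) are our own simplifications:

```python
def select_strides(strides):
    """Stride selection: keep strides with 0.2 s <= duration <= 3.0 s and length > 0.15 m."""
    return [s for s in strides
            if 0.2 <= s["end"] - s["start"] <= 3.0 and s["length"] > 0.15]

def assemble_walking_bouts(strides, max_break_s=3.0, min_strides=4):
    """Walking bout assembly: group valid strides into WBs, starting a new WB whenever
    the break between consecutive strides exceeds 3 s. A WB is kept only if it contains
    at least min_strides strides (approximating 'at least two consecutive strides of both feet')."""
    bouts, current = [], []
    for s in sorted(strides, key=lambda s: s["start"]):
        if current and s["start"] - current[-1]["end"] > max_break_s:
            if len(current) >= min_strides:
                bouts.append(current)
            current = []
        current.append(s)
    if len(current) >= min_strides:
        bouts.append(current)
    return bouts

# Hypothetical usage: six valid strides with small gaps form a single WB.
strides = [{"start": i * 1.1, "end": i * 1.1 + 1.0, "length": 1.2} for i in range(6)]
walking_bouts = assemble_walking_bouts(select_strides(strides))
```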
Validation of Walking Speed
All comparisons between the wearable device and the reference system were performed based on the average walking speed within each identified WB. In addition, comparison results for cadence and stride length can be found in Supplementary Tables 1 and 2.
Real-world recordings also pose new challenges during data analysis. The WBs detected by the wearable device and the reference system might not match up, so a direct comparison of individual strides or WBs is not always possible. One straightforward approach is to average DMOs across all WBs before comparison. However, this reduced granularity makes it difficult to fully understand under which circumstances a wearable device works well, and can “mask” biases or errors (e.g., over- or underestimation under specific circumstances) that could only be identified by considering single WBs. We therefore proposed a new approach for this type of data analysis, splitting it into a detailed comparison of only those WBs that were identified by both systems (True positive WBs) and a traditional analysis of all data combined (Combined WBs).
- True Positive Evaluation: A novel method of analysis, which directly compares the DMOs on only the WBs that were detected by both systems (true positives). This allows the calculation of traditional comparison metrics (e.g., intra-class correlation and Bland-Altman plots) that require a direct comparison of individual measurement points. WBs were included in the true positive analysis if there was an overlap of more than 80% between the two systems (details about the selection of this threshold can be found in Supplementary Figure 1; a sketch of this matching step is given after this list).
- Combined Evaluation: The traditional method of analysis, in which we calculated the median walking speed from all identified WBs within each laboratory task (resulting in one value per gait task per participant) or within the 2.5-hour real-world assessment (resulting in one value per participant) and compared these combined values between the systems. This comparison is free of the potential biases introduced by selecting only the true positive WBs and reflects how DMOs will typically be evaluated in a research or clinical setting, or when reference data are not available.
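The identification of true positive WBs can be sketched as follows. This is a simplified greedy illustration under our own assumptions (in particular, the overlap is computed relative to the shorter of the two bouts); the published analysis may differ in implementation detail:

```python
def overlap_ratio(a, b):
    """Fraction of temporal overlap between two WBs given as (start_s, end_s),
    relative to the shorter of the two bouts (our assumption)."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return overlap / shorter if shorter > 0 else 0.0

def match_true_positives(device_wbs, reference_wbs, threshold=0.8):
    """Return (device index, reference index) pairs of WBs overlapping by more than the threshold."""
    matches, used = [], set()
    for i, d in enumerate(device_wbs):
        for j, r in enumerate(reference_wbs):
            if j not in used and overlap_ratio(d, r) > threshold:
                matches.append((i, j))
                used.add(j)
                break
    return matches

# Hypothetical WB intervals in seconds:
device = [(10.0, 25.0), (40.0, 48.0)]
reference = [(9.5, 26.0), (60.0, 70.0)]
print(match_true_positives(device, reference))  # [(0, 0)] -> one true positive WB
```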
Factors that can influence walking speed validity
A range of factors can influence walking speed, which may affect algorithm performance and the validity of the results. We investigated possible sources of confounding, such as the cohort, the environment (laboratory vs real-world), gait task complexity, and participant performance (walking speed and walking bout duration), upon walking speed validity. All comparisons (unless otherwise stated) were performed using the WBs identified as true positives (true positive evaluation).
Influence of the cohort and environment
We compared errors in the estimation of walking speed between each of the different clinical cohorts included in the study, alongside differences between laboratory and real-world environments.
Influence of gait task complexity
Real-world walking contains complex gait sequences, comprising short steps, frequent turns, or obstacle negotiation, and individuals often multitask while walking. Thus, gait patterns observed in the real world are not comparable with the straight walking tasks undertaken in controlled environments, even if we account for differences in WB duration and walking speed. To assess the effect of gait task complexity, we compared the validation results of walking speed estimated from (i) the simulated daily activities (high complexity), (ii) slow straight walking (low complexity), (iii) all straight walking tasks (low complexity), and (iv) all laboratory tests with the validation results of walking speed estimated from real-world walking. We further subdivided real-world walking based on the percentage of each WB that was assessed to be turning. On this basis, we defined the following levels of gait complexity: (v) “simple” straight gait (<20% of the WB covered by turns) and (vi) “complex” gait (≥60% covered by turns).
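For illustration, this turn-based complexity grading can be expressed as in the following sketch; the thresholds are those stated above, while the handling of WBs falling between them (left unassigned here) and the input format are our own assumptions:

```python
def classify_wb_complexity(wb_duration_s, turn_intervals):
    """Grade a WB by the percentage of its duration covered by detected turns."""
    turn_time = sum(end - start for start, end in turn_intervals)
    pct_turning = 100.0 * turn_time / wb_duration_s
    if pct_turning < 20.0:
        return "simple"        # "simple" straight gait: <20% covered by turns
    if pct_turning >= 60.0:
        return "complex"       # "complex" gait: >=60% covered by turns
    return "unassigned"        # intermediate WBs not assigned to either level (assumption)

# Hypothetical WB of 30 s containing two detected turns of 2 s and 3 s (~17% turning):
print(classify_wb_complexity(30.0, [(5.0, 7.0), (20.0, 23.0)]))  # 'simple'
```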
Influence of walking speed and walking bout duration
Given the impact of real-world WB durations and speeds 40 on the adopted biomechanical strategies 51, we analysed their influence on the validity of the estimated walking speed. For this, we assessed whether the validity of the walking speed estimation differed across specific WB duration bins (< 10 seconds, > 10 seconds, 10 to 30 seconds, 30 to 60 seconds, > 60 seconds and > 120 seconds). This was first performed for all true positive WBs, comparing their errors across each WB duration bin, and subsequently repeated for the combined analysis by calculating the median walking speed for each participant within the respective duration bin and comparing the median values between the reference system and the wearable device. Together, these analyses allowed the quantification of walking speed to be validated across different walking strategies.
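A sketch of the duration-bin analysis for the combined evaluation, assuming a per-WB table with participant identifier, WB duration, and walking speed from both systems (the column names and the inclusiveness of the bin boundaries are our own assumptions):

```python
import pandas as pd

# Hypothetical table of true positive WBs: one row per WB.
wbs = pd.DataFrame({
    "participant": ["p01", "p01", "p01", "p02", "p02"],
    "duration_s": [8.0, 25.0, 95.0, 12.0, 140.0],
    "speed_device_mps": [0.82, 0.95, 1.10, 0.70, 1.02],
    "speed_reference_mps": [0.85, 0.97, 1.08, 0.74, 1.00],
})

# WB duration bins as listed in the text (boundary handling is assumed).
duration_bins = {
    "<10 s": lambda d: d < 10,
    ">10 s": lambda d: d > 10,
    "10-30 s": lambda d: (d >= 10) & (d <= 30),
    "30-60 s": lambda d: (d > 30) & (d <= 60),
    ">60 s": lambda d: d > 60,
    ">120 s": lambda d: d > 120,
}

# Combined analysis: median walking speed per participant within each duration bin,
# compared between the wearable device and the reference system.
for label, in_bin in duration_bins.items():
    subset = wbs[in_bin(wbs["duration_s"])]
    medians = subset.groupby("participant")[["speed_device_mps", "speed_reference_mps"]].median()
    print(label, medians.to_dict("index"))
```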
Validation measures
For all types of evaluations (all available WBs/aggregated values or on the respective subgroups), we calculated various statistical/comparison measures to quantify the walking speed estimation error for the sensitivity analysis:
- The intra-class correlation coefficient (ICC(2,1)) 52 was calculated to assess the association between the DMOs of the two systems. Based on the ICC estimates, values < 0.5, between 0.5 and 0.75, between 0.75 and 0.9, and > 0.90 were deemed to be indicative of poor, moderate, good, and excellent reliability, respectively 53.
- Absolute agreement was assessed by quantifying (i) the accuracy/mean absolute error (MAE), (ii) the bias/mean error and (iii) the precision/limits of agreement (LoA) 54 between the walking speed estimates of the two systems.
- The mean relative error (MRE) and mean absolute relative error (MARE) were estimated as the ratio between the (absolute) error per WB and the corresponding estimate from the reference system, expressed as a percentage.
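For reference, the agreement measures above (other than the ICC, which is typically obtained from a standard statistical package) can be computed from paired per-WB walking speed estimates as in the following sketch; the function and variable names are illustrative:

```python
import numpy as np

def agreement_metrics(device_speed, reference_speed):
    """Bias, accuracy, precision and relative errors between paired walking speed estimates (m/s)."""
    device_speed = np.asarray(device_speed, float)
    reference_speed = np.asarray(reference_speed, float)
    error = device_speed - reference_speed
    return {
        "mean_error": error.mean(),                                    # bias
        "mae": np.abs(error).mean(),                                   # accuracy
        "loa_lower": error.mean() - 1.96 * error.std(ddof=1),          # precision: 95% limits of agreement
        "loa_upper": error.mean() + 1.96 * error.std(ddof=1),
        "mre_pct": 100.0 * (error / reference_speed).mean(),           # mean relative error
        "mare_pct": 100.0 * (np.abs(error) / reference_speed).mean(),  # mean absolute relative error
    }

# Hypothetical paired per-WB walking speeds (true positive WBs):
device = [0.82, 0.95, 1.10, 0.70, 1.02]
reference = [0.85, 0.97, 1.08, 0.74, 1.00]
print(agreement_metrics(device, reference))
```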