A total of 57 participants (mean age: 35.0 years, 47.4% men) were recruited for the validity study (validity dataset, Figure 1, Table 1) and 23 participants (mean age: 35.0 years, 47.8% men) were recruited for the test-retest reliability study (reliability dataset, Figure 1, Table 1).
Test-retest reliability, measurement error, and learning effects
The evaluation of test-retest reliability (ICC), measurement error (SRD%), and learning effects (η) can be found in Table 2. Across metrics, the ICC was 0.87±0.11 (median ± interquartile range), with only the path length ratio return (ICC 0.62) and the jerk peg approach (ICC 0.32) falling below the cut-off of 0.7. The SRD% across metrics was 30.9±18.7, with the measurement error being below the cut-off of 30.3 for five metrics, namely log jerk return (27.5), SPARC return (25.4), velocity max. return (23.8), grip force rate num. peaks transport (28.9), and grip force rate SPARC transport (19.1). No strong learning effects were found between assessment timepoints one and two (η -1.3±2.5) or between assessment timepoints two and three (η -0.1±2.8). Although a statistically significant learning effect was visible between timepoints two and three for the velocity max. return metric (η -5.07, p=0.019, t=-2.51, DoF=22), the effect was not deemed strong according to the cut-off (-6.35%).
When only three instead of five repetitions of the VPIT were considered (Table SM1), across metrics, the ICC decreased by 0.04±0.06 (min 0.01, max 0.30 for jerk peg approach), the SRD% increased by 1.72±3.75 (min 0.05, max 32.27 for jerk peg approach), η between timepoints one and two increased by 0.02±1.90 (min 0, max 6.28 for log jerk return), and η between timepoints two and three increased by 1.08±5.84 (min 0, max 7.04 for log jerk return). With three repetitions, the same eight metrics were reliable, two additional metrics exceeded the measurement error cut-off (log jerk return 31.3 and grip force rate num. peaks transport 31.7), and no metrics showed strong learning effects.
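The across-metric summaries quoted for five repetitions (median ± interquartile range) can be recomputed from the per-metric values listed in Table 2. The following is a minimal sketch, not the authors' analysis code. The source does not state which quantile convention was used; the helper `quantile` below assumes linear interpolation at the 1-based position n·p + 0.5, and the names `quantile` and `median_iqr` are illustrative.

```python
import math
from statistics import median

# Per-metric values for five VPIT repetitions, copied from Table 2
# (rounded as printed there).
ICC = [0.86, 0.92, 0.93, 0.80, 0.62, 0.90, 0.33, 0.82, 0.91, 0.88]
SRD_PCT = [32.89, 27.51, 25.42, 44.08, 45.23, 23.84, 55.48, 28.92, 19.14, 34.55]
ETA_1_2 = [1.31, -2.33, 0.20, -2.36, -2.95, 3.25, -1.84, 0.00, -1.42, -1.25]
ETA_2_3 = [-0.11, 2.97, 0.14, 3.48, -2.68, -5.07, -0.09, 0.00, -0.14, -4.29]


def quantile(values, p):
    """Quantile by linear interpolation at 1-based position n*p + 0.5.

    ASSUMPTION: this convention is chosen for illustration; the source
    does not say which quantile definition it uses.
    """
    xs = sorted(values)
    n = len(xs)
    pos = min(max(n * p + 0.5, 1.0), float(n))  # clamp to [1, n]
    lower = math.floor(pos)
    if lower >= n:
        return xs[-1]
    frac = pos - lower
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


def median_iqr(values):
    """Return (median, interquartile range) of a list of numbers."""
    return median(values), quantile(values, 0.75) - quantile(values, 0.25)


if __name__ == "__main__":
    for label, data in [("ICC", ICC), ("SRD%", SRD_PCT),
                        ("eta 1-2", ETA_1_2), ("eta 2-3", ETA_2_3)]:
        med, iqr = median_iqr(data)
        print(f"{label}: median={med:.2f}, IQR={iqr:.2f}")
```

Under this convention the sketch gives a median of 0.87 and an interquartile range of 0.11 for the ICC (rounded to two decimals), in line with the values quoted in the text. Other quantile conventions leave the median unchanged but yield a different interquartile range.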
Table 1 Characteristics of the participants used for evaluating the reliability and validity of the VPIT metrics.
| Characteristic | ARSACS validity dataset (n=57, 1 session) | ARSACS reliability dataset (n=23, 3 sessions) |
|---|---|---|
| Age (y) | 35.0±13.5 (16-61) | 35.0±11.0 (27-57) |
| Sex: men, n (%) | 27 (47.4) | 11 (47.8) |
| Sex: women, n (%) | 30 (52.6) | 12 (52.2) |
| Homozygous, n (%) | 52 (92.8), n=56 | 23 (100) |
| SARA (0-40) | 19±14 (4-36), n=56 | 20.5±10.6 (7-36) |
| NHPT (s) | 29.7±33.7 (23.9-144.9), n=56 | 45.3±18.4 (23.9-105.5) |
| SFNT (# of repetitions) | 10.4±4 (5.8-21.3) | 10.5±4.3 (6.0-21.3) |
| Grip strength (kg) | 29.2±15.9 (17.2-59.1), n=55 | 24.7±16.6 (17.2-59.1) |
| Pinch strength (kg) | 5.7±2.2 (3.3-10.3), n=55 | 5.8±1.9 (3.3-9.2), n=22 |
| LEMOCOT (# of repetitions) | 19.0±15.3 (1-48), n=49 | 18.5±6.0 (1-40), n=18 |
| Barthel index (0-100) | 90±20 (35-100), n=54 | 85.0±27.5 (45-100), n=21 |
Values reported as median ± interquartile range (minimum-maximum). If missing values were present, n denotes the number of participants without missing values. ARSACS: Autosomal Recessive Spastic Ataxia of Charlevoix-Saguenay. LEMOCOT: Lower Extremity Motor Coordination Test. NHPT: Nine Hole Peg Test. SFNT: Standardized Finger-Nose Test. SARA: Scale for the Assessment and Rating of Ataxia. VPIT: Virtual Peg Insertion Test.
Concurrent validity
The hypothesis related to the expected correlations with the Nine Hole Peg Test was partially fulfilled, as significant moderate correlations were found with metrics describing movement control (SPARC return ρ=-0.32, p=0.017 and velocity max. return ρ=0.46, p=0.0004), but no significant correlations were observed with metrics describing grip force control. The hypothesis related to the expected correlations with the Standardized Finger-Nose Test was fulfilled, as significant moderate correlations were observed with metrics describing movement control (SPARC return ρ=-0.39, p=0.003 and velocity max. return ρ=-0.51, p<0.0001). The hypothesis related to the expected correlations with grip strength was fulfilled, as no significant correlations were observed. The hypothesis related to the expected correlations with pinch strength was partially fulfilled, as no significant correlations with metrics of grip force control were observed, whereas a significant correlation was observed with a metric of arm control (velocity max. return ρ=-0.39, p=0.0039). The hypothesis related to the expected correlations with the LEMOCOT was not fulfilled, as significant correlations were observed with metrics of arm control (velocity max. return ρ=-0.32, p=0.021) and grip force control (grip force rate num. peaks transport ρ=-0.53, p<0.0001 and grip force rate SPARC transport ρ=-0.39, p=0.0044). The hypothesis related to the expected correlations with the Barthel index was fulfilled, as no significant correlations were observed. Thus, overall, three hypotheses related to the concurrent validity of the VPIT metrics were fulfilled, two were partially fulfilled, and one was not fulfilled.
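The correlations above are Spearman rank correlations (ρ). For readers unfamiliar with the statistic, the following minimal sketch shows how ρ can be computed as the Pearson correlation of average ranks. It is not the authors' implementation, the function names `average_ranks` and `spearman_rho` are illustrative, and the example data are hypothetical values invented for demonstration, not study data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # HYPOTHETICAL data for illustration only (not from the study):
    # a clinical score and a digital health metric for six participants.
    clinical_score = [24.1, 30.5, 28.0, 45.2, 60.3, 33.3]
    digital_metric = [0.41, 0.52, 0.47, 0.50, 0.66, 0.58]
    print(f"rho = {spearman_rho(clinical_score, digital_metric):.2f}")
```

Because ρ depends only on ranks, it captures monotonic rather than strictly linear association, which suits clinical scales with skewed or bounded distributions. In practice, an established routine such as `scipy.stats.spearmanr` would also provide the p-value.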
Table 2 Evaluation of reliability and learning effects of the Virtual Peg Insertion Test (VPIT) metrics considering five task repetitions and three repeated assessment sessions.
| Digital health metrics | Reliability (5 VPIT repetitions): ICC [CI] | Reliability (5 VPIT repetitions): SRD% | Learning effects (5 VPIT repetitions): norm. slope η (session 1 & 2) | Learning effects (5 VPIT repetitions): norm. slope η (session 2 & 3) |
|---|---|---|---|---|
| Log jerk transport | 0.86 [0.81, 0.90] | 32.89 | 1.31 | -0.11 |
| Log jerk return | 0.92 [0.89, 0.94] | 27.51 | -2.33 | 2.97 |
| SPARC return | 0.93 [0.90, 0.95] | 25.42 | 0.20 | 0.14 |
| Path length ratio transport | 0.80 [0.73, 0.86] | 44.08 | -2.36 | 3.48 |
| Path length ratio return | 0.62 [0.48, 0.73] | 45.23 | -2.95 | -2.68 |
| Velocity max. return | 0.90 [0.86, 0.93] | 23.84 | 3.25 | -5.07* |
| Jerk peg approach | 0.33 [0.09, 0.52] | 55.48 | -1.84 | -0.09 |
| Grip force rate num. peaks transport | 0.82 [0.76, 0.87] | 28.92 | 0.00 | 0.00 |
| Grip force rate SPARC transport | 0.91 [0.88, 0.94] | 19.14 | -1.42 | -0.14 |
| Grip force rate hole approach | 0.88 [0.83, 0.91] | 34.55 | -1.25 | -4.29 |
ICC: intra-class correlation. CI: confidence interval. SRD%: smallest real difference. *p<0.05, **p<0.001 for paired t-test between sessions. For all three statistics, accepted cut-offs (ICC > 0.7, SRD% < 30.3, η > -6.35 or non-significant) were used to determine if a metric fulfils each of the evaluation criteria (values in bold font).
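The cut-offs in the footnote can be applied mechanically to the values in Table 2. The sketch below does so; the `Metric` class and its method names are illustrative and not part of the authors' pipeline, and the significance flags encode only the asterisks printed in the table.

```python
from dataclasses import dataclass

# Cut-offs as stated in the Table 2 footnote.
ICC_MIN = 0.7
SRD_PCT_MAX = 30.3
ETA_MIN = -6.35


@dataclass(frozen=True)
class Metric:
    name: str
    icc: float
    srd_pct: float
    # One (eta, significant) pair per session comparison: (1 & 2), (2 & 3).
    eta: tuple

    def reliable(self):
        return self.icc > ICC_MIN

    def acceptable_error(self):
        return self.srd_pct < SRD_PCT_MAX

    def no_strong_learning(self):
        # Criterion from the footnote: eta > -6.35 or non-significant.
        return all(value > ETA_MIN or not sig for value, sig in self.eta)


# Values copied from Table 2 (five VPIT repetitions).
METRICS = [
    Metric("Log jerk transport", 0.86, 32.89, ((1.31, False), (-0.11, False))),
    Metric("Log jerk return", 0.92, 27.51, ((-2.33, False), (2.97, False))),
    Metric("SPARC return", 0.93, 25.42, ((0.20, False), (0.14, False))),
    Metric("Path length ratio transport", 0.80, 44.08, ((-2.36, False), (3.48, False))),
    Metric("Path length ratio return", 0.62, 45.23, ((-2.95, False), (-2.68, False))),
    Metric("Velocity max. return", 0.90, 23.84, ((3.25, False), (-5.07, True))),
    Metric("Jerk peg approach", 0.33, 55.48, ((-1.84, False), (-0.09, False))),
    Metric("Grip force rate num. peaks transport", 0.82, 28.92, ((0.00, False), (0.00, False))),
    Metric("Grip force rate SPARC transport", 0.91, 19.14, ((-1.42, False), (-0.14, False))),
    Metric("Grip force rate hole approach", 0.88, 34.55, ((-1.25, False), (-4.29, False))),
]

reliable = [m.name for m in METRICS if m.reliable()]
low_error = [m.name for m in METRICS if m.acceptable_error()]
no_learning = [m.name for m in METRICS if m.no_strong_learning()]
passing_all = [m.name for m in METRICS
               if m.reliable() and m.acceptable_error() and m.no_strong_learning()]

if __name__ == "__main__":
    print(f"Reliable (ICC > {ICC_MIN}): {len(reliable)}")
    print(f"Acceptable error (SRD% < {SRD_PCT_MAX}): {len(low_error)}")
    print(f"No strong learning effect: {len(no_learning)}")
    print("Passing all three criteria:", ", ".join(passing_all))
```

Applied to the tabulated values, this yields eight reliable metrics, five metrics with acceptable measurement error, and no metric with a strong learning effect, consistent with the counts given in the text.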
Table 3 Concurrent validity (Spearman correlations) between VPIT digital health metrics and clinical assessments.
| Digital health metrics (rows) vs. clinical assessments (columns) | Nine Hole Peg Test | Standardized Finger to Nose Test | Grip strength | Pinch strength | Lower Extremity Motor Coordination Test | Barthel index |
|---|---|---|---|---|---|---|
| Log jerk transport | 0.15 | -0.18 | -0.23 | -0.15 | -0.01 | -0.08 |
| Log jerk return | 0.20 | -0.11 | -0.12 | -0.01 | 0.16 | 0.11 |
| SPARC return | -0.32* | -0.39* | -0.24 | -0.1 | 0.01 | -0.1 |
| Path length ratio transport | 0.25 | -0.16 | -0.04 | 0.00 | 0.05 | -0.12 |
| Path length ratio return | 0.17 | -0.13 | -0.27 | -0.01 | 0.16 | 0.00 |
| Velocity max. return | 0.46** | -0.51** | -0.21 | -0.39* | -0.32* | -0.24 |
| Jerk peg approach | 0.10 | 0.04 | -0.15 | -0.1 | 0.09 | 0.09 |
| Grip force rate num. peaks transport | 0.21 | -0.18 | -0.13 | -0.21 | -0.53** | 0.00 |
| Grip force rate SPARC transport | 0.26 | -0.24 | 0.04 | -0.16 | -0.39* | -0.15 |
| Grip force rate hole approach | 0.21 | -0.06 | -0.12 | -0.10 | -0.14 | -0.09 |
| | Hypothesis partially fulfilled | Hypothesis fulfilled | Hypothesis fulfilled | Hypothesis partially fulfilled | Hypothesis not fulfilled | Hypothesis fulfilled |
*p<0.05, **p<0.001. SPARC: spectral arc length.