The following experiment was approved by the Northwestern Institutional Review Board. Fourteen individuals with intact limbs (ITL) and seven individuals with below-elbow amputations (AMP, Table 1) participated in this study after providing written informed consent. Due to partial data loss, results from one ITL participant and one AMP participant were excluded from the final analysis.
Table 1. Amputee Subject Demographics
Subject | Age (years) | Gender | Time Since Amputation | Level Of Amputation | DOFs Controlled
AMP1 | 73 | M | 32 years | Transradial | 3DOF
AMP2 | 33 | M | 5 years | Wrist disarticulation | 3DOF
AMP3 | 65 | M | 6 years | Transradial | 2DOF
AMP4 | 56 | M | 40 years | Transradial | 2DOF
AMP5 | 48 | M | 11 months | Transradial | 2DOF
AMP6 | 19 | M | 10 months | Transradial | 2DOF
Experimental Setup
For ITL participants (Fig. 1a), six channels of EMG signals were collected using dry stainless-steel bipolar electrodes (Motion Control Inc.) that were embedded in an adjustable armband. The electrodes were equally spaced around the subject’s right arm, with the reference electrode positioned just distal to the olecranon. An HTC Vive tracker was attached to the dorsal side of the armband and used to track the participant’s limb position. Participants also wore an orthosis around the wrist and hand to promote isometric contractions that would more closely resemble amputee contractions. Finally, a 400 g weight was attached to the distal end of the orthosis to simulate the weight of a prosthesis.
Due to the unique size and anatomy of each residual limb, dry electrode setups that are not specifically customized for an amputee are prone to electrode liftoff. Hence, we used wet electrodes for AMP participants to prevent unwanted interface noise (Fig. 1b). Six channels of EMG signals were collected using adhesive Ag/AgCl bipolar electrodes (Bio-Medical Instruments) that were secured under a silicone liner. The electrodes were equally spaced around the subject’s residual limb and the reference electrode was placed just distal to the olecranon. An adjustable lightweight frame was fastened around the residual limb and lengthened to match the subject’s intact limb length. A 400 g weight was attached to the distal end of the frame to simulate the weight of a prosthesis.
Data Collection Protocol
All data collection was conducted in an HTC Vive virtual reality environment (Fig. 1c). A training data set and a test data set were collected from each participant during one experimental session. EMG signals were sampled at a rate of 1 kHz, band-pass filtered between 70 and 300 Hz, and segmented into 200 ms windows in 25 ms increments. In addition to a hardware gain of 2 and a software gain of 1000, there were channel-specific software gains that were customized for each subject. These channel gains were calculated by scaling the signals in each channel to span the output range of -5 V to 5 V. Although the channel gains were calculated using the training data set alone, they were applied to both the training and test data sets.
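The segmentation and gain steps can be illustrated with a minimal sketch. The function names are hypothetical, and reading "span the output range" as peak-absolute scaling is an assumption rather than the documented procedure.

```python
def sliding_windows(signal, fs_hz=1000, window_ms=200, step_ms=25):
    """Split one channel into overlapping windows (lists of samples)."""
    win = fs_hz * window_ms // 1000    # 200 samples at 1 kHz
    step = fs_hz * step_ms // 1000     # 25 samples at 1 kHz
    return [signal[i:i + win] for i in range(0, len(signal) - win + 1, step)]


def channel_gain(train_channel, v_max=5.0):
    """Gain that maps the largest training excursion onto +/- v_max volts.

    Assumption: the gain is derived from the peak absolute value of the
    training signal for that channel.
    """
    peak = max(abs(s) for s in train_channel)
    return v_max / peak if peak > 0 else 1.0


# Toy one-second signal: (1000 - 200) / 25 + 1 = 33 full windows
toy = [0.001 * ((i % 50) - 25) for i in range(1000)]
windows = sliding_windows(toy)
gain = channel_gain(toy)           # computed from training data only
scaled = [gain * s for s in toy]   # the same gain would be reused on test data
```

Because the gain is fixed from the training set, test signals scaled with it can exceed the nominal output range; the sketch does not clip them.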
To collect training data, the subject performed hand and wrist gestures while moving their arm around the workspace. This simple training protocol has been shown to achieve high real-time performance [33,34]. All ITL subjects and two AMP subjects completed seven gestures (rest, wrist flexion/extension, wrist pronation/supination, hand open/close), corresponding to a 3DOF controller. Based on clinician input and to minimize fatigue, the remaining four AMP participants completed five gestures (rest, wrist pronation/supination, hand open/close), corresponding to a 2DOF controller. Each gesture was held for 2.5 seconds and repeated five times, resulting in 12.5 seconds (500 overlapping windows) of clean training examples per gesture.
To collect test data, the subject performed the trained hand and wrist gestures in four limb positions (Fig. 2a). Each gesture was held for 2.5 seconds and repeated five times. Therefore, each participant had 50 seconds (2000 overlapping windows) of clean test data for each gesture.
Offline Analyses
After EMG data collection, all further analyses were conducted offline on a Windows 10 laptop computer with 16 GB RAM, an Intel Core i7-9850H CPU at 2.60 GHz, and a 4 GB NVIDIA Quadro T1000 GPU. These analyses included training data augmentation, training five control strategies, testing those strategies, and statistical evaluations.
Training Data Augmentation
We constructed an augmented training data set by systematically corrupting up to four channels in copies of the original raw training signals. The augmented data contained one clean copy of the original data and eight noisy copies. Within the eight noisy copies, there were two copies of data that had one noisy EMG channel, two copies that had two noisy channels, two copies that had three noisy channels, and two copies that had four noisy channels. We evenly distributed 12 types of synthetic noise into all possible channel combinations. These synthetic noise types included flatlining, in which the signal was completely attenuated to 0 V, five levels of Gaussian noise centered at 0 V (σ = 1, 2, 3, 4, 5 V), five levels of 60 Hz noise (amplitude = 1, 2, 3, 4, 5 V), and a randomized mixture of all noise types.
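The 12 noise types and the channel combinations can be enumerated with a short sketch. Several details are assumptions made for illustration: Gaussian and 60 Hz noise are added to the signal, the sinusoid starts at zero phase, and the "randomized mixture" draws one of the other 11 types at random. All names are hypothetical.

```python
import itertools
import math
import random

N_CHANNELS = 6
FS_HZ = 1000  # sampling rate stated in the text


def flatline(x, rng):
    """Attenuate the channel completely to 0 V."""
    return [0.0] * len(x)


def gaussian(sigma_v):
    """Zero-mean Gaussian noise; additive superposition is an assumption."""
    def apply(x, rng):
        return [s + rng.gauss(0.0, sigma_v) for s in x]
    return apply


def mains(amplitude_v, freq_hz=60.0):
    """60 Hz sinusoid; additive superposition and zero phase are assumptions."""
    def apply(x, rng):
        return [s + amplitude_v * math.sin(2 * math.pi * freq_hz * i / FS_HZ)
                for i, s in enumerate(x)]
    return apply


NOISE_TYPES = {"flatline": flatline}
for level in (1, 2, 3, 4, 5):
    NOISE_TYPES[f"gauss_{level}V"] = gaussian(float(level))
    NOISE_TYPES[f"mains_{level}V"] = mains(float(level))
BASE_NAMES = sorted(NOISE_TYPES)  # the 11 non-mixture types


def mixture(x, rng):
    """Assumed reading of 'randomized mixture': pick one base type at random."""
    return NOISE_TYPES[rng.choice(BASE_NAMES)](x, rng)


NOISE_TYPES["mixture"] = mixture  # 12 types in total

# Every way to choose 1 to 4 noisy channels out of 6
CHANNEL_COMBOS = {k: list(itertools.combinations(range(N_CHANNELS), k))
                  for k in (1, 2, 3, 4)}


def corrupt(channels, noisy_idx, noise_name, rng):
    """Return a copy of `channels` with `noise_name` applied to `noisy_idx`."""
    out = [list(ch) for ch in channels]
    for c in noisy_idx:
        out[c] = NOISE_TYPES[noise_name](out[c], rng)
    return out


rng = random.Random(0)
clean = [[0.1] * 50 for _ in range(N_CHANNELS)]
noisy = corrupt(clean, (0, 3), "flatline", rng)
```

There are 6, 15, 20, and 15 ways to choose one, two, three, and four noisy channels, respectively. The sketch does not show how noise types are balanced across those combinations.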
Control Strategies
Five control strategies were trained and evaluated in this study. Before we trained the controllers, four time-domain features were extracted from the training data sets: mean absolute value, waveform length, zero crossings, and slope sign changes.
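For reference, the four time-domain features can be sketched for a single analysis window as follows. The zero default thresholds and the channel-major ordering of the 24-element vector are assumptions; the text does not specify either.

```python
def mean_absolute_value(x):
    """Average rectified amplitude of the window."""
    return sum(abs(s) for s in x) / len(x)


def waveform_length(x):
    """Cumulative absolute difference between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def zero_crossings(x, threshold=0.0):
    """Count sign changes whose step size exceeds `threshold`."""
    return sum(1 for a, b in zip(x, x[1:])
               if a * b < 0 and abs(a - b) > threshold)


def slope_sign_changes(x, threshold=0.0):
    """Count local extrema whose neighbouring slopes exceed `threshold`."""
    count = 0
    for a, b, c in zip(x, x[1:], x[2:]):
        d1, d2 = b - a, b - c
        if d1 * d2 > 0 and (abs(d1) > threshold or abs(d2) > threshold):
            count += 1
    return count


def feature_vector(window_channels):
    """Concatenate the four features per channel (channel-major, assumed)."""
    feats = []
    for ch in window_channels:
        feats += [mean_absolute_value(ch), waveform_length(ch),
                  zero_crossings(ch), slope_sign_changes(ch)]
    return feats


demo = [1.0, -1.0, 2.0, -2.0, 0.0]
demo_features = feature_vector([demo] * 6)  # six channels -> 24 features
```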
Traditional LDA Classifiers
Three control methods were based on the traditional LDA classifier algorithm.
1. Baseline LDA (LDA) – To act as the baseline model, an LDA classifier was trained with the original training data set. This algorithm is used in most clinically available PR systems and therefore demonstrates what current prosthesis users experience.
2. Augmented LDA (LDA+) – To investigate how data augmentation affects the reliability of a standard LDA algorithm, we trained an LDA classifier with the augmented training data set.
3. Adaptive LDA (LDA-) – We implemented an existing fast-retraining LDA classifier that circumvented signal disturbances by adjusting its LDA weights after removing noisy EMG channels [9]. First, we trained an LDA classifier with the original training data set and stored the class mean and covariance matrices. During classification, we omitted the elements corresponding to noisy EMG channels from the mean and covariance matrices and recalculated the LDA weights. Then, we removed the noisy EMG channels from the classifier inputs and used the new LDA weights to classify the remaining signals. In practice, this control strategy requires a fault detector to detect noisy signals. However, we excluded this step and instead assumed that a perfect fault detector was used. Thus, the LDA- classifier shows the best-case scenario for an adaptive LDA control system. When there were no noisy channels, the LDA- classifier was identical to the baseline LDA classifier.
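The channel-removal step of the adaptive LDA strategy can be illustrated with a small pure-Python sketch. It assumes a channel-major feature layout, a pooled covariance matrix, and equal class priors, and it uses the textbook LDA discriminant (w = Σ⁻¹μ, b = -½ μᵀΣ⁻¹μ). It is not the implementation from [9], and all names are hypothetical.

```python
FEATURES_PER_CHANNEL = 4  # MAV, WL, ZC, SSC


def kept_indices(n_channels, noisy_channels):
    """Feature indices that survive, assuming channel-major ordering."""
    noisy = set(noisy_channels)
    return [c * FEATURES_PER_CHANNEL + f
            for c in range(n_channels) if c not in noisy
            for f in range(FEATURES_PER_CHANNEL)]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lda_weights(class_means, cov, keep):
    """Recompute per-class (w, b) from stored statistics restricted to `keep`."""
    sub_cov = [[cov[i][j] for j in keep] for i in keep]
    weights = []
    for mu in class_means:
        sub_mu = [mu[i] for i in keep]
        w = solve(sub_cov, sub_mu)                            # w = Sigma^-1 mu
        b = -0.5 * sum(wi * mi for wi, mi in zip(w, sub_mu))  # equal priors
        weights.append((w, b))
    return weights


def classify(x, weights, keep):
    """Score the retained features and return the winning class index."""
    sub_x = [x[i] for i in keep]
    scores = [sum(wi * xi for wi, xi in zip(w, sub_x)) + b for w, b in weights]
    return scores.index(max(scores))


# Toy example: 2 channels, 2 classes, identity covariance
means = [[1, 0, 0, 0, 1, 0, 0, 0], [0, 1, 0, 0, 0, 1, 0, 0]]
cov = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
x = [0, 50, 0, 0, 1, 0, 0, 0]          # channel 0 corrupted, channel 1 clean
all_idx = kept_indices(2, [])
keep = kept_indices(2, [0])            # a perfect fault detector flags channel 0
pred_full = classify(x, lda_weights(means, cov, all_idx), all_idx)
pred_adapted = classify(x, lda_weights(means, cov, keep), keep)
```

In the toy example the corrupted channel pushes the full classifier toward class 1, while the adapted classifier, which sees only channel 1, returns class 0.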
Neural Network-Aligned Classifiers
The remaining two control strategies comprised two stages: a latent encoder network that aligned the EMG inputs to a low-dimensional manifold and an LDA classifier that classified these latent variables (Fig. 2d). One control strategy used a multilayer perceptron network (MLP-LDA) while the other used a convolutional neural network (CNN-LDA). Both models were implemented using Keras 2.3.1 with the TensorFlow backend.
We used five-fold cross-validation with the augmented training data set to tune each model’s hyperparameters. To avoid overlapping training and validation data, each fold corresponded to one gesture repetition. After the hyperparameters were determined, we trained the final models with the entire augmented training data set.
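A minimal sketch of repetition-wise folds is shown below. It assumes each training window carries the index of the gesture repetition it came from, and that augmented copies inherit the repetition index of their source window. The function name is hypothetical.

```python
def repetition_folds(rep_ids):
    """Yield (train_idx, val_idx) pairs, holding out one repetition per fold."""
    for held_out in sorted(set(rep_ids)):
        val = [i for i, r in enumerate(rep_ids) if r == held_out]
        train = [i for i, r in enumerate(rep_ids) if r != held_out]
        yield train, val


# Toy example: ten windows, two from each of five repetitions
rep_ids = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
folds = list(repetition_folds(rep_ids))
```

Splitting by repetition rather than by window keeps overlapping windows from the same contraction on the same side of each split.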
Both networks were trained using the Adam optimization algorithm with a learning rate of 0.001, using mini-batches of 128 examples for 30 epochs. To accelerate training, we used a min-max scaler to rescale the input features to the range [0, 1] and applied batch normalization after each hidden layer.
4. Multilayer perceptron-aligned LDA (MLP-LDA) – We trained a fully connected five-layer neural network (Fig. 2b) to take in a 24 by 1 EMG feature vector and output a 4 by 1 latent feature vector z and a predicted gesture label y.
The first four hidden layers aligned the EMG inputs to the latent space. We applied ReLU activation functions after the first three layers and a linear activation function after the fourth layer. Based on our cross-validation results, we found that classification accuracy improved as the dimensionality of the latent space increased but began to plateau after a dimensionality of 4. Thus, we set the latent dimension to 4. We also regularized the weights of the fourth layer with L1 regularization (λ= 10e-5) to encourage sparsity and improve generalization.
The last layer of the MLP was a linear classifier that used a softmax activation function to classify the latent features. The network was trained to minimize the categorical cross entropy loss between the predicted class and the ground truth, thus optimizing linear separability between movement classes in the latent space.
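One way to write this objective for a single training example is given below. The notation is introduced here for illustration only: V and b are the weights and bias of the softmax layer, W_4 denotes the L1-regularized fourth-layer weights, K is the number of gesture classes, and y is the one-hot ground-truth label. Averaging over each mini-batch is assumed.

```latex
\hat{y} = \operatorname{softmax}(V z + b), \qquad
\mathcal{L} = -\sum_{k=1}^{K} y_k \log \hat{y}_k \;+\; \lambda \,\lVert W_4 \rVert_1
```

Because the softmax layer is linear in z, lowering the cross-entropy term pushes the classes apart along linear decision boundaries in the latent space, which is the property the downstream LDA classifier relies on.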
Since neural network classifiers are prone to overfitting, we trained an LDA classifier with the latent features of the augmented data set and used it in tandem with the MLP network to form the MLP-LDA control strategy (Fig. 2d). During classification, the EMG input vectors were passed through the MLP encoder to compute the latent features z, which were then fed to the LDA classifier to obtain gesture predictions.
In total, the MLP had 1267 trainable parameters.
5. Convolutional neural network-aligned LDA (CNN-LDA) – We trained a CNN (Fig. 2c) with the same objectives as the MLP: to output a 4 by 1 EMG latent feature vector z and a predicted gesture label y.
While the inputs for the previous control strategies were 24 by 1 feature vectors, the CNN input was a 6 by 4 feature matrix, corresponding to the 6 EMG channels and 4 time-domain features. This enabled the 2-dimensional convolutional layers to exploit the spatial relationships between EMG channels and learn more robust latent representations.
The first five hidden layers served as the encoder, starting with two convolutional layers with ReLU activation functions. Then, we flattened the output of the convolutional layers before passing it through two sequential layers with a ReLU and a linear activation function, respectively. Thus, the latent encoder modules of the CNN and MLP each had three ReLU functions and one linear function. Like the MLP, the layer preceding the bottleneck was regularized with L1 regularization (λ=10e-5) and the latent space had a dimensionality of 4.
The last layer of the CNN classified the latent feature vector z using a softmax activation function. The CNN was trained to minimize the categorical cross entropy loss between the predicted class and the ground truth, once again to encourage linear separability between the class latent representations.
Finally, we trained an LDA classifier with the augmented training data set after it was aligned by the CNN. The CNN-LDA control strategy (Fig. 2d) used the CNN encoder to compute latent variables z which were then classified with the LDA classifier.
In total, the CNN had 12999 trainable parameters.
Evaluation
To evaluate control performance and robustness, we calculated the offline classification accuracies of each control strategy on clean and noisy test data. Since it was impractical and challenging to introduce interface noise in a controlled manner during data collection, we constructed noisy test data offline by fusing the original raw test signals with examples from a real noise database.
Real Noise Database
The effects of four noise types were investigated in this study: broken wires, moving broken wires, contact artifacts, and loose electrodes. A database containing 25 seconds (1000 overlapping windows) of each type was collected from one ITL subject (Fig. 3). Since the housing of the armband prevented access to individual electrodes, this database was recorded using the wet electrode setup. Although all six channels were recorded, only the affected noisy channel was stored in the database.
To simulate the broken wire and moving broken wire conditions, one wire was cut at the connection point between the wire and the electrode. For the broken wire condition, the subject maintained a 90-degree angle at the elbow throughout data collection. For the moving broken wire condition, the subject moved their arm around freely in a workspace that contained sources of electrical noise, such as monitors and laptops. For the contact artifact condition, the electrode was tapped approximately every 200 ms. Finally, for the loose electrode condition, the electrode was peeled off and gently shifted around the surface of the subject’s skin throughout signal recording.
Fusion of Test Signals and Real Noise
We constructed four noisy test sets, each containing a distinct number of noisy EMG channels (1 to 4 channels). Each noisy set started as a copy of the clean raw test signals. We then systematically superimposed pseudorandomized samples from the real noise database onto the copy, ensuring that all combinations of affected channels and noise types were equally represented. To maintain signal amplification consistency, the subject-specific channel gains were applied to the noise windows according to the channels with which they were being fused. Signals were then truncated to stay within the output range of [-5 V, 5 V]. Finally, we extracted the four time-domain features from the noisy test signals.
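The per-window fusion step can be sketched as follows. The sketch assumes the clean window already has its channel gain applied and that superposition is sample-wise addition. The function name is hypothetical.

```python
def fuse(clean, noise, gain, v_min=-5.0, v_max=5.0):
    """Add a gain-scaled noise window to a clean window, then clip to range."""
    return [min(v_max, max(v_min, c + gain * n)) for c, n in zip(clean, noise)]


# Toy example: the second sample would reach -6 V and is truncated to -5 V
fused = fuse([1.0, -4.0, 0.5], [0.01, -0.02, 0.0], gain=100.0)
```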
Statistical Analyses
The statistical analyses were conducted separately for ITL and AMP populations. We used linear mixed effects models to evaluate the statistical effects of each control algorithm with respect to the baseline LDA method. Initially, we fit a model with classification accuracy as the response variable; the control strategy (LDA, LDA+, LDA-, MLP-LDA, and CNN-LDA), the number of noisy electrodes (0-4), and their interactions as fixed factors; and the subject identifier as a random factor. Statistical significance was judged based on a significance level of α = 0.05. After observing that all interaction factors were statistically significant (p < 0.001), the data were separated by the number of noisy electrodes. These data sets were used to fit five new models that each had the control strategy as a fixed factor and subject identifier as a random factor. We used the Bonferroni method to correct for multiple comparisons.
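The Bonferroni correction can be illustrated with a short sketch. The p-values below are made-up placeholders, not results from this study, and treating the four comparisons against the baseline as one family is an assumption.

```python
def bonferroni(p_values, alpha=0.05):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return adjusted, [p_adj < alpha for p_adj in adjusted]


# Placeholder p-values for four strategies compared against the baseline LDA
adjusted, significant = bonferroni([0.001, 0.02, 0.04, 0.5])
```

With these placeholder inputs, only the first comparison remains below α = 0.05 after correction.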