EMG based Hand Gesture Classification using
Empirical Mode Decomposition Time-Series and
Deep Learning
Deniz Hande KISA, Mehmet Akif OZDEMIR, and Onan GUREN
Department of Biomedical Engineering
Izmir Katip Celebi University
Izmir, TURKEY

Aydin AKAN
Dept. of Electrical and Electronics Eng.
Izmir University of Economics
Izmir, TURKEY

[email protected], [email protected], [email protected],
[email protected]

Abstract—Computer systems working with artificial intelligence can recognize movements and gestures for many purposes. Recognition can draw on the electrical activity of the muscles, which is represented by electromyography (EMG); EMG is a nonstationary biological signal. EMG-based movement recognition systems have an important place in areas such as human-computer interaction, virtual reality, prostheses, and hand exoskeletons. In this study, a new approach based on deep learning (DL) and Empirical Mode Decomposition (EMD) is proposed to improve the accuracy of hand movement recognition in these application areas. Firstly, 4-channel surface EMG (sEMG) signals were measured from 30 subjects while they performed 7 different hand gestures: extension, flexion, ulnar deviation, radial deviation, punch, open hand, and rest. The signals were then filtered during preprocessing to remove noise, and the preprocessed signals were segmented. Thereafter, EMD was applied to each segmented signal to obtain Intrinsic Mode Functions (IMFs). Time-series images (screen captures of the plotted time series) of the first 3 IMFs were recorded. For classification, the IMF images were given as inputs to train a 101-layer Convolutional Neural Network (CNN) based on the Residual Network (ResNet) architecture, which is a DL model.

Keywords—Convolutional Neural Network (CNN), Deep Learning, Electromyography (EMG), Empirical Mode Decomposition (EMD), Hand Gesture, Intrinsic Mode Function (IMF), ResNet.

I. INTRODUCTION

In the movement system, the human body moves through the harmonious action of the bones, joints, and muscles. The ability to move is essential in daily human life, from communication to actions. The most active component of that system is the muscle, and it gives information about movement [1]. For this purpose, EMG is utilized to investigate the working mechanism of muscles and their effects on movement. EMG does not have a symmetrical structure, and it is a nonstationary bio-signal formed as a sum of biopotentials during contraction. It is assumed that EMG carries information about which movement takes place. These signals can be acquired with sEMG electrodes, a major noninvasive technique. EMG signals can be utilized in the control strategies of rehabilitation and myoelectric devices, and for that purpose they need to be classified [1].

The capability of a computer system or device to recognize hand movements may be utilized in many potential applications, such as sign language recognition, robotics, virtual reality, and human-computer interaction (HCI). HCI is especially significant in military and medical areas, which include studies of hand gestures such as real-time sEMG control of prostheses, haptic systems, and exoskeletons [2].

In myoelectric control systems, EMG signals need to be processed and classified to ensure movement prediction and recognition. Classification of the EMG signals used for control has become essential, especially in multi-grip or upper limb prostheses. EMG gives information about the movement type and where it takes place in these systems [3]. Hand gesture recognition can be utilized to predict movement in exoskeleton systems, providing more advanced movement prediction and better synchronization [4]. sEMG-based pattern recognition systems have shown good potential for controlling the described systems [3].

To achieve better HCI synchronization, new techniques are needed to improve the accuracy and the synchronization capability of myoelectric control systems in movement recognition, particularly in the recognition of hand gestures. Many engineering and medical practices for EMG-based hand gesture recognition have become available [4]. Before classification and recognition, useful methods are needed to analyze the signals. For this aim, EMD can be utilized. EMD is an analysis method for nonstationary and nonlinear time series that decomposes the signal into a sequence of oscillations named IMFs. The IMFs of a multichannel biological signal extracted by EMD can be analyzed with distinct time- and frequency-domain techniques. It increases the accuracy rate and demonstrates better
This work was supported by the Scientific and Technical Research Council of
Turkey (TUBITAK) under Grant No. 1919B011903429.
978-1-7281-8073-1/20/$31.00 ©2020 IEEE
Authorized licensed use limited to: Universidade Estadual de Campinas. Downloaded on September 08,2021 at 17:02:56 UTC from IEEE Xplore. Restrictions apply.
performance than complex and traditional preprocessing methods [5]. The EMD method decomposes a signal without leaving the time domain. EMD is useful for any application that requires filtering EMG signals during the preprocessing stage. Previous studies have shown that it can be applied successfully to decrease EMG noise. The main distinction of the method is that it performs signal decomposition adaptively, based only on the available data, rather than using a predefined filter set [6]. Additionally, numerous DL-based hand gesture recognition works have been published. Recent DL architectures report high accuracies (>95%) [7]. So, to obtain high accuracy, a CNN architecture was utilized as the DL method in this study.

Sapsanis et al. presented a pattern recognition method to recognize hand movements utilizing sEMG data. They decomposed EMG signals into IMFs utilizing EMD and then extracted features. Their results indicated that EMD could enhance the discriminative capability of the traditional feature set obtained from raw EMG. For instance, for one subject the average accuracy increased from 86.92% with features extracted from raw EMG to 90.42% with all extracted features [8].

Yan et al. presented an effective and efficient combination of feature extraction and multiclass classifier for motion classification by analyzing sEMG signals. They introduced EMD to decompose the EMG signals into a few IMFs and then calculated the coefficients of each IMF to form the feature set. They designed a multi-class classifier based on least squares support vector machines (LS-SVMs) for the classification of various movements. Their results demonstrated that the accuracy of movement recognition was improved [9].

Xiaojing et al. investigated feature extraction and classification of sEMG signals. Firstly, they used an independent component analysis technique to remove the power frequency interference, and then the low-noise signal was processed by EMD. After that, the decomposed signal was utilized to establish an autoregressive model. They utilized the coefficients of the model as signal features and an optimized probabilistic neural network for the classification of 6 forearm movements. Their results showed that the proposed method was effective for feature extraction and classification [10].

In this study, the collected EMG signals are segmented, IMFs are created via the EMD process, and time-series images (screen captures of the plotted time series) of the first 3 IMFs are recorded. These IMF images are given as input data to the CNN structure, which is ResNet-101.

II. MATERIALS AND METHOD

A. EMG Dataset

In this study, the sEMG signals utilized for hand gesture recognition were recorded with a 4-channel BIOPAC MP36 device. 30 healthy volunteers (15 females and 15 males) took part in this study. Their ages were between 18 and 25. The sEMG signals were recorded at a 2 kHz sampling frequency for 7 different hand gestures utilizing Ag/AgCl surface bipolar electrodes. The seven hand gestures (in Fig. 2) are extension, flexion, ulnar deviation, and radial deviation of the wrist, punch, open hand, and rest position. The amplitude of the recorded EMG signals is between 0-10 mV or 0-1.5 mV.

To collect data with as little noise as possible, data were taken from surface muscles close to the skin. EMG signals usually have small amplitudes. As the number of measured muscles increases, the number of biological signals that can be measured and monitored also increases. Accordingly, 4 distinct surface muscles were selected from which useful, low-noise signals can be received during the hand gestures.

The 4 muscles utilized are the flexor carpi radialis, flexor carpi ulnaris, extensor carpi radialis, and extensor carpi ulnaris. After the skin was cleaned with alcohol to remove dead cells and oils, the sEMG electrodes were placed at the approximate locations shown in Fig. 1. Each sEMG recording lasted 515 seconds and included 5 cycles.

Fig. 1. Placement of the four channels of sEMG electrodes.

B. Pre-Signal Processing and Segmentation

The raw sEMG signals from the volunteers were filtered to remove noise. Noise caused by external sources or by the body during muscle contractions was removed with a digital filter with a 5-500 Hz pass band and a 50 Hz notch filter. After filtering, segmentation was performed: the 4-channel sEMG signals were divided into 4-second time windows corresponding to the instants when the gesture was performed. This process increased the amount of data available for DL and, additionally, ensured that valuable information during the gesture was not missed. An example of segmented EMG signals for hand gestures can be seen in Fig. 2. Each participant performed 5 repeated cycles. Therefore, the total number of time-series images can be calculated as: 30 participants x 5 reps x 7 hand gestures x 4 channels x 4 signal groups (the original signal and the first 3 IMFs) = 16800.

C. Empirical Mode Decomposition of sEMG Segments

After segmentation, the EMD method was applied to each segmented signal to obtain IMFs, and time-series images (snap screens of the plotted time series) of the first 3 IMFs were recorded.

The sEMG signal is nonstationary and nonlinear, like other signals in nature. So, utilizing algorithms that assume linearity and stationarity may be improper. At this point, EMD provides an adaptive method for the analysis of nonstationary signals [8].
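As a rough illustration of the segmentation and image-count arithmetic of Section II-B, the sketch below cuts 4-second windows out of a 4-channel recording sampled at 2 kHz and reproduces the reported totals. The gesture onset times, the synthetic recording, and the helper name `segment` are assumptions made for illustration; the paper does not state how onsets were located.

```python
import numpy as np

FS = 2000       # sampling frequency in Hz (from the paper)
WINDOW_S = 4    # segment length in seconds (from the paper)

def segment(recording, onsets_s, fs=FS, window_s=WINDOW_S):
    """Cut fixed-length windows starting at each gesture onset.

    recording: array of shape (channels, samples)
    onsets_s:  gesture start times in seconds (hypothetical values)
    returns:   array of shape (n_segments, channels, window_s * fs)
    """
    n = int(window_s * fs)
    starts = [int(round(t * fs)) for t in onsets_s]
    return np.stack([recording[:, s:s + n] for s in starts])

# Synthetic stand-in for one 4-channel recording (not real sEMG data).
rng = np.random.default_rng(0)
recording = rng.standard_normal((4, 60 * FS))
onsets_s = [5.0, 15.0, 25.0]           # hypothetical onset times
segments = segment(recording, onsets_s)

# Image-count arithmetic stated in the text.
n_images = 30 * 5 * 7 * 4 * 4          # participants x reps x gestures x channels x groups
per_group = n_images // 4              # original signal, IMF1, IMF2, IMF3
n_train = per_group * 80 // 100        # 80% training split
n_test = per_group - n_train           # remaining 20%
```

Each window holds 4 s x 2000 Hz = 8000 samples per channel, and the per-group count matches the 4200 images the paper reports for each training run.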
EMD behaves like a nonlinear and adaptive filter: it decomposes the signal into several IMFs. An IMF represents a basic oscillating function fulfilling two conditions [8]:

i. The number of zero crossings and the number of local extrema are the same or differ by one.
ii. The local mean, defined as the mean of the local maximum and local minimum envelopes, is equal to zero.

The two conditions imply that all minima of an IMF are negative and all its maxima are positive [8]. For a signal x(t), the EMD algorithm can be summarized as follows:

i. Identify all local maxima and local minima of the signal x(t).
ii. Interpolate between consecutive local maxima and between consecutive local minima with a cubic function to form an upper envelope $e_{\max}(t)$ and a lower envelope $e_{\min}(t)$ [11].
iii. Calculate the average of the upper and lower envelopes:

$$m(t) = \frac{e_{\min}(t) + e_{\max}(t)}{2} \qquad (1)$$

iv. Subtract the average from x(t) to obtain the detail:

$$d(t) = x(t) - m(t) \qquad (2)$$

v. If the number of local extrema of d(t) is equal to or differs by one from the number of zero crossings, and the mean of d(t) is close to zero, then IMF1 = d(t). Otherwise, repeat steps i to iv on d(t) in place of x(t) until the new d(t) satisfies the conditions of an IMF in step v.
vi. Calculate the residue $r(t) = x(t) - d(t)$.
vii. If r(t) is above the threshold, repeat steps i to vi on r(t) to obtain the next IMF and a new residue.

Finally, n nearly orthogonal IMFs are obtained, from which the initial signal x(t) can be reconstructed as follows [12]. Hence, the initial signal x(t) is decomposed into a sum of IMFs plus a residual term [8]:

$$x(t) = \sum_{i=1}^{n} IMF_i(t) + r(t) \qquad (3)$$

Hereafter, IMFs derived utilizing the ordinary EMD process are named first-order IMFs. The highest-frequency oscillations in the initial signal x(t) are represented by the first IMF. The following IMFs cover the lower-frequency oscillations of x(t). The last residue shows only the overall trend of the signal [12].

D. Deep Learning Architecture

The time-series snap screens of the first 3 IMFs from EMD were given as inputs to train the 101-layer CNN for classification. The working principle of a CNN is based on the visual cortex, and it is designed to imitate the connectivity pattern of neurons in the brain. A CNN contains three types of layers: convolutional, pooling, and fully connected. As the number of layers in a CNN grows, the network becomes difficult to train, and the accuracy saturates and then degrades. Residual learning helps to resolve this accuracy degradation. Residual learning uses shortcut connections as a training technique that passes the input of a layer not only to the next adjacent layer but also directly to a later layer. It can be readily explained as the subtraction of the input features from the features learned by that layer, and ResNet performs it with bypass connections around every pair of 3x3 filters. In this way, the vanishing gradient problem is avoided by reusing the activation from the previous layer.

In this study, the 101-layer ResNet-101 was trained with the time-series images of IMFs obtained from sEMG signals recorded during 7 distinct hand gestures. Its network comprises 4 types of residual learning blocks. ResNet-101 is obtained by modifying ResNet-50. In this study, a residual network-based CNN architecture is used to prevent overfitting on partially similar IMF time-series snap screens.

III. RESULTS

In this study, a total of 4200 time-series images were created for every group (original signals, IMF1, IMF2, and IMF3) and fed to the ResNet-101 network for training. Each training process used 4200 images, and training was repeated 4 times, once per group. The images were divided by the validation split method: 80% for training the network and the remaining 20% for testing the trained model. The training and validation accuracy graphs and the confusion matrices for all test groups are shown in Fig. 3. For the original signal, the training loss was 0.0862, the training accuracy 95.92%, and the validation loss 0.3019. The validation accuracy of the original signal was 94.22%. The F1 score was calculated as 79.67% and the mean squared error as 0.322619.
Fig. 2. Block representation of this study: seven different hand gestures, a) Open Hand, b) Punch, c) Radial Deviation, d) Ulnar Deviation, e) Extension, f) Rest, and g) Flexion; an example of segmented sEMG plots of extension and the EMD time-series output of its first three IMFs; the ResNet block; and the SoftMax output representation (the sEMG data used in this figure belong to the 1st sEMG channel of participant #4 in the database).
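To make the EMD stage of the pipeline in Fig. 2 more concrete, the following is a minimal sketch of the sifting procedure of Section II-C, not the authors' implementation. It assumes natural cubic splines for the envelopes, adds the two signal end points as envelope knots, and stops sifting when the energy of the envelope mean is small relative to the detail; the names `natural_cubic_spline`, `envelope_mean`, `sift`, and `emd`, the tolerance, and the synthetic test signal are all illustrative choices that the paper does not specify.

```python
import numpy as np

def natural_cubic_spline(xk, yk, t):
    """Evaluate the natural cubic spline through knots (xk, yk) at points t."""
    n = len(xk)
    h = np.diff(xk)
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0              # natural ends: zero second derivative
    for i in range(1, n - 1):
        A[i, i - 1], A[i, i], A[i, i + 1] = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6 * ((yk[i + 1] - yk[i]) / h[i] - (yk[i] - yk[i - 1]) / h[i - 1])
    M = np.linalg.solve(A, rhs)            # second derivatives at the knots
    i = np.clip(np.searchsorted(xk, t) - 1, 0, n - 2)
    a, b = xk[i + 1] - t, t - xk[i]
    return ((M[i] * a**3 + M[i + 1] * b**3) / (6 * h[i])
            + (yk[i] / h[i] - M[i] * h[i] / 6) * a
            + (yk[i + 1] / h[i] - M[i + 1] * h[i] / 6) * b)

def local_extrema(x):
    """Step i: indices of interior local maxima and minima."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def envelope_mean(x):
    """Steps ii-iii: m(t) = (e_min(t) + e_max(t)) / 2, as in Eq. (1)."""
    t = np.arange(len(x), dtype=float)
    maxima, minima = local_extrema(x)
    ends = [0, len(x) - 1]                 # assumed boundary treatment
    up = np.unique(np.concatenate((maxima, ends)))
    lo = np.unique(np.concatenate((minima, ends)))
    e_max = natural_cubic_spline(up, x[up], t)
    e_min = natural_cubic_spline(lo, x[lo], t)
    return 0.5 * (e_min + e_max)

def sift(x, max_iter=50, tol=1e-3):
    """Steps iv-v: repeat d(t) = x(t) - m(t) until the mean is near zero."""
    d = x.copy()
    for _ in range(max_iter):
        m = envelope_mean(d)
        d = d - m
        if np.sum(m**2) <= tol * np.sum(d**2):   # assumed stopping rule
            break
    return d

def emd(x, n_imfs=3):
    """Steps vi-vii: peel off up to n_imfs IMFs and return them with the residue."""
    r = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(n_imfs):
        maxima, minima = local_extrema(r)
        if len(maxima) + len(minima) < 3:  # too few extrema left to sift
            break
        d = sift(r)
        imfs.append(d)
        r = r - d
    return np.array(imfs), r

# Synthetic three-tone signal at 2 kHz (not real sEMG data).
fs = 2000
t = np.arange(1000) / fs
x = np.sin(2 * np.pi * 120 * t) + 0.6 * np.sin(2 * np.pi * 30 * t) + 0.3 * np.sin(2 * np.pi * 6 * t)
imfs, r = emd(x)
```

By construction the IMFs and the final residue sum back to the input, which mirrors Eq. (3). The dense linear solve keeps the sketch short but scales poorly with the number of extrema, so a banded solver or an established EMD library would be preferable for full 8000-sample segments.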
Fig. 3. Network training results: training and validation accuracy for a) only the original signals' time-series snap screens, b) IMF1 snap screens, c) IMF2 snap screens, d) IMF3 snap screens; and confusion matrices for e) only the original signals, f) IMF1, g) IMF2, h) IMF3 (E: Extension, F: Flexion, O: Open Hand, P: Punch, R: Radial Deviation, X: Rest, and U: Ulnar Deviation).
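For readers who want to relate confusion matrices such as those in Fig. 3 to the reported accuracy and F1 values, the sketch below derives both from a matrix whose rows are true classes and whose columns are predicted classes. The 3-class matrix is a made-up toy example, not data from this study, and macro-averaging of the F1 score is an assumption, since the paper does not state how F1 was averaged.

```python
import numpy as np

def summarize(cm):
    """Return (accuracy, macro-averaged F1) for a square confusion matrix.

    Rows are true classes and columns are predicted classes.
    Assumes every class is predicted at least once (no zero columns).
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                       # correctly classified samples per class
    precision = tp / cm.sum(axis=0)        # TP / (TP + FP)
    recall = tp / cm.sum(axis=1)           # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, f1.mean()

# Hypothetical 3-class confusion matrix for illustration only.
toy_cm = [[8, 1, 1],
          [0, 9, 1],
          [2, 0, 8]]
accuracy, macro_f1 = summarize(toy_cm)
```

With 7 gesture classes the same function applies unchanged to a 7x7 matrix.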
For the training of IMF1, the training loss was 3.3257e-05, the training accuracy 100%, and the validation loss 0.0088. The validation accuracy of IMF1 was 99.73%. Its F1 score was calculated as 99.05% and its mean squared error as 0.123810. For the training of IMF2, the training loss was 0.0179, the training accuracy 99.28%, and the validation loss 0.0312. The validation accuracy of IMF2 was 98.86%. Its F1 score was calculated as 96.05% and its mean squared error as 0.361905. For the training of IMF3, the training loss was 0.0398, the training accuracy 98.42%, and the validation loss 0.0563. The validation accuracy of IMF3 was 97.94%. Its F1 score was calculated as 93.54% and its mean squared error as 0.720238.

IV. CONCLUSION

In this study, IMFs of sEMG signals obtained from 7 different hand gestures were created via EMD. Screenshots of the first 3 IMFs were used to train the ResNet-101 architecture. To compare the success of the IMFs, screenshots of the original sEMG signals were also used to train the CNN-based architecture.

When the training results are examined, all groups achieved high accuracy in classifying hand movements. The time-series images obtained from IMF1 were more successful than those of the other groups. Moreover, the original signal gave the worst results, which further supports the benefit of the IMFs obtained by EMD. The accurate results of this study can be attributed to the fact that processing time-series snapshots in the DL model is a simple and effective idea.

In conclusion, EMD is a useful and feasible process to apply to sEMG signals, which have a nonstationary and nonlinear structure. Its application yields more information from these signals and better accuracy than the raw or original sEMG for the recognition of hand movements. In future work, the proposed method can be developed further by using larger datasets and additional hand gestures. In this way, the applicability of the method can be increased.

REFERENCES

[1] M. Y. Iscan and M. Steyn, The human skeleton in forensic medicine. Charles C Thomas Publisher, 2013.
[2] M. J. Cheok, Z. Omar, and M. H. Jaward, "A review of hand gesture and sign language recognition techniques," International Journal of Machine Learning and Cybernetics, vol. 10, no. 1, pp. 131-153, 2019.
[3] A. K. Mukhopadhyay and S. Samui, "An experimental study on upper limb position invariant EMG signal classification based on deep neural network," Biomedical Signal Processing and Control, vol. 55, p. 101669, 2020.
[4] J. L. Ren, Y. H. Chien, E. Y. Chia, L. C. Fu, and J. S. Lai, "Deep Learning based Motion Prediction for Exoskeleton Robot Control in Upper Limb Rehabilitation," in 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 5076-5082: IEEE.
[5] E. Izci, M. A. Ozdemir, R. Sadighzadeh, and A. Akan, "Arrhythmia detection on ECG signals by using empirical mode decomposition," in 2018 Medical Technologies National Congress (TIPTEKNO), 2018, pp. 1-4: IEEE.
[6] A. O. Andrade, S. Nasuto, P. Kyberd, C. M. Sweeney-Reed, and F. Van Kanijn, "EMG signal filtering based on empirical mode decomposition," Biomedical Signal Processing and Control, vol. 1, no. 1, pp. 44-55, 2006.
[7] G. Jia, H. K. Lam, J. Liao, and R. Wang, "Classification of Electromyographic Hand Gesture Signals using Machine Learning Techniques," Neurocomputing, 2020.
[8] C. Sapsanis, G. Georgoulas, A. Tzes, and D. Lymberopoulos, "Improving EMG based classification of basic hand movements using EMD," in 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2013, pp. 5754-5757: IEEE.
[9] Z. G. Yan, Z. Z. Wang, and X. M. Ren, "Joint application of feature extraction based on EMD-AR strategy and multi-class classifier based on LS-SVM in EMG motion classification," Journal of Zhejiang University-SCIENCE A, vol. 8, no. 8, pp. 1246-1255, 2007.
[10] S. Xiaojing, T. Yantao, and L. Yang, "Feature extraction and classification of sEMG based on ICA and EMD decomposition of AR model," in 2011 International Conference on Electronics, Communications and Control (ICECC), 2011, pp. 1464-1467: IEEE.
[11] S. Yol, M. A. Ozdemir, A. Akan, and L. F. Chaparro, "Detection of epileptic seizures by the analysis of EEG signals using empirical mode decomposition," in 2018 Medical Technologies National Congress (TIPTEKNO), 2018, pp. 1-4: IEEE.
[12] R. Damaševičius, M. Vasiljevas, I. Martišius, V. Jusas, D. Birvinskas, and M. Wozniak, "BoostEMD: an extension of EMD method and its application for denoising of EMG signals," Elektronika ir elektrotechnika, vol. 21, no. 6, pp. 57-61, 2015.