
IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, VOL. 16, NO. 4, AUGUST 2008

Comparative Analysis of Spectral Approaches to Feature Extraction for EEG-Based Motor Imagery Classification

Pawel Herman, Member, IEEE, Girijesh Prasad, Senior Member, IEEE, Thomas Martin McGinnity, Member, IEEE, and Damien Coyle, Member, IEEE

Manuscript received April 15, 2007; revised July 25, 2007; accepted February 16, 2008. First published June 3, 2008; last published August 13, 2008 (projected). The work of P. Herman was supported by the Vice-Chancellor's Research Scholarship at the University of Ulster. The authors are with the Intelligent Systems Research Centre, School of Computing and Intelligent Systems, University of Ulster, Derry, BT48 7JL, U.K. (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TNSRE.2008.926694

Abstract—The quantification of the spectral content of electroencephalogram (EEG) recordings has a substantial role in clinical and scientific applications. It is of particular relevance in the analysis of event-related brain oscillatory responses. This work is focused on the identification and quantification of relevant frequency patterns in motor imagery (MI) related EEGs utilized for brain–computer interface (BCI) purposes. The main objective of the paper is to perform a comparative analysis of different approaches to spectral signal representation such as power spectral density (PSD) techniques, atomic decompositions, time-frequency (t-f) energy distributions, and continuous and discrete wavelet approaches, from which band power features can be extracted and used in the framework of MI classification. The emphasis is on identifying discriminative properties of the feature sets representing EEG trials recorded during imagination of either left- or right-hand movement. Feature separability is quantified in the offline study using the classification accuracy (CA) rate obtained with linear and nonlinear classifiers. PSD approaches demonstrate the most consistent robustness and effectiveness in extracting the distinctive spectral patterns for accurately discriminating between left and right MI induced EEGs. This observation is based on an analysis of data recorded from eleven subjects over two sessions of BCI experiments. In addition, generalization capabilities of the classifiers, reflected in their intersession performance, are discussed in the paper.

Index Terms—Alternative communication, brain–computer interface (BCI), electroencephalogram (EEG), spectral analysis, time-frequency (t-f) analysis, wavelet transforms.

I. INTRODUCTION

THE electroencephalogram (EEG) is one of the most clinically and scientifically exploited signals recorded from humans. Hence, its quantification plays a prominent role in brain studies [1]. In particular, the examination of the frequency content of EEG signals has been recognized as the most preponderant approach to the problem of extracting knowledge of the brain dynamics [1], [2] and thus it has been heavily exploited in the study of the brain phenomena related to motor imagery (MI) [3]. The outcome of this study is crucial for successful development of brain–computer interface (BCI) technology, which is aimed at replacing the impaired functionality of a human neuromuscular system with computer-based analysis of the EEG and establishing a communication channel independent of peripheral nerves and muscles [4]. Most BCIs utilize EEG features that are well-defined physiologically, which include oscillations in neuronal circuits or potentials evoked by particular stimuli [4]. In this work, lateralized changes in the signal's spectral content within the µ (8–12 Hz) and β (18–25 Hz) sensorimotor rhythms due to imagination of left- or right-hand movement are exploited for MI discrimination. The underlying brain phenomena are referred to as event-related desynchronization (ERD) and event-related synchronization (ERS) [1], [3].

The task of EEG spectral quantification is particularly challenging considering the complexity of the dynamics of nonstationary EEG [1]. It is required that the time variation of the relevant frequency components is accounted for. Statistical spectrum estimation methods, both parametric [5] and nonparametric [6], [7], have been extensively utilized for EEG analysis in studies of MIs. Digital filtering, commonly applied in BCIs, e.g., [8], represents a different concept of feature extraction. Instead of estimating the spectral content of signal segments, digital filters select their rhythmical components depending on the frequency band of interest. Joint time-frequency (t-f) and time-scale (t-s) techniques are considered to be particularly well suited for analysis of EEGs as representative examples of signals with time-varying spectral content [2]. Hence, quadratic t-f distributions and wavelet-based methods have also been exploited in BCI, e.g., [9], [10].

EEG feature extraction approaches are clearly dominated by methods estimating the signal's energy distributed in the frequency, t-f, or t-s domains. However, to the best of the authors' knowledge, a comparative analysis of a sizeable set of these methods applied to the same data in the BCI framework using state-of-the-art linear and nonlinear BCI classifiers has not been undertaken yet. Most comparative evaluations reported in the literature are confined to a small number of techniques or only one classifier [8], [11], [12]. Alternatively, studies involving an examination of different feature types are concerned with the problem of their selection from multichannel EEG recordings, which does not provide much insight into the relation between the nature of the given t-f signal representation and the discriminative properties of the features generated [7], [13]. Therefore, this paper is aimed at a more systematic analysis of different approaches to quantifying the frequency content of the EEG trials recorded for BCI purposes.

They are categorized into four groups: spectral estimation methods, atomic decompositions, quadratic t-f distributions, and wavelet-based techniques. Wavelets are considered in this work despite the fact that they map a signal into the t-s domain [2], since the scale can be interpreted and related to the frequency domain. All the techniques are applied within a uniform feature extraction procedure to quantify the signal energy in the frequency bands of interest. For comparative purposes, autoregressive (AR) based feature extraction is employed, where the AR coefficients are used as features of the EEG signal. The AR modeling approach has attracted a lot of interest in BCI (e.g., [12]) since it does not involve meticulous tuning of subject-specific frequency bands. Both linear and nonlinear approaches to the MI classification task are adopted to investigate the separability of the features. Linear discriminant analysis (LDA) [14], regularized Fisher discriminant (RFD) [15], and support vector machines (SVMs) [16] with linear and nonlinear kernels are employed as classifiers. In this offline study, classification is carried out discretely at the end of every signal trial, as in [9] and [17], and similar to the delayed mode in [18]. Classification accuracy (CA) serves as the primary performance measure.

The major outcome of the comparative analysis reported in this work is the observation that power spectral density (PSD) approaches are the most robust and consistent in extracting the distinctive MI-related EEG spectral patterns over two BCI sessions for eleven subjects. Moreover, on average, the linear classifiers have shown superior generalization capability to that of the SVMs with Gaussian kernels.

The paper is organized as follows. Section II outlines the methodology used with emphasis on the techniques applied to extract spectral information from the EEG. The results of the comparative analysis of different combinations of the EEG features and the BCI classifiers are demonstrated in Section III and discussed in Section IV. Conclusions and the scope of future work are presented in Section V.

II. METHODS

A. Data Acquisition

There are two sets of EEG data utilized in this work. The first dataset was obtained from the Institute of Human-Computer Interfaces, Graz University of Technology. The EEG signals were recorded from three subjects (S1, S2, and S3) in a timed experimental recording procedure where the subjects were instructed to imagine moving the left and the right hand in accordance with a directional cue displayed on a computer monitor [Fig. 1(a)]. Each trial was 8 s in length. A fixation cross was displayed from … to …. The beginning of a trial was marked by an acoustic stimulus at …. Next, an arrow (left or right) was displayed as a cue at …. Therefore, the segment of the data recorded after … of each trial is considered as event related and is used for offline analysis. The recordings were made with a g.tec amplifier and AgCl electrodes over two consecutive sessions, each session consisting of 140 trials for S1 and 160 trials for S2 and S3, with an equal number of trials representing the two classes [19]. Two bipolar EEG channels were measured over the C3 and C4 locations (two electrodes placed 2.5 cm anterior and posterior to positions C3 and C4) according to the international standard 10/20 system [1]. The EEGs were then sampled at a frequency of 128 Hz and band-pass filtered in the range 0.5–30 Hz. The experimental settings for recording the data under consideration are described in greater detail in [19].

Fig. 1. Data recording in (a) Graz BCI paradigm [12] and (b) BCI basket paradigm.

The second EEG dataset was acquired at the Intelligent Systems Research Centre, University of Ulster at Magee using the same g.tec equipment as that used by the Graz BCI group. The EEG data were obtained from eight subjects over ten (for subjects …) or fourteen (for subjects … and …) 160-trial (balanced) sessions. Two consecutive sessions for each subject were arbitrarily selected for the extensive offline analysis reported in this paper. Trials were sampled at … and those with distinctive muscle or eye artifacts were removed by trained staff. The subjects were asked to imagine moving the left or the right hand depending on the horizontal location (left/right) of a target basket displayed at the bottom of a monitor screen [Fig. 1(b)]. Each trial was 7 s in length. A ball was displayed at the top of the screen from … to …. In the meantime, at … an acoustic stimulus signified the beginning of a trial and then the baskets (target in green and nontarget in red) were displayed at …. At … the ball started moving to the bottom of the screen. The segment of the data recorded after … of each trial was analyzed in this work. The horizontal component of the ball movement was continuously controlled in online experiments by the subject via the biofeedback mechanism. However, as mentioned, the comparative EEG analysis discussed in this paper was carried out offline with classification performed at the end of every trial. This approach, with classification delayed until the entire trial is processed, resembles the concept of single MI-related EEG trial discrimination [9], [17] as opposed to the continuous classification adopted in online experiments.

The EEG datasets obtained from two different BCI laboratories, the Graz and ISRC BCI, were evaluated in the same framework in this work as they share the same characteristics accounted for in offline analysis. In addition, since the Graz recordings were acquired from subjects with more experience and a longer history of participation in BCI experiments when compared to the participants of the recent study conducted in the ISRC BCI laboratory, this mixture of EEG datasets allowed for capturing general trends in the performance of spectral EEG feature extraction methods and BCI classifiers.
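As a minimal illustration of the preprocessing described above (0.5–30 Hz band-pass filtering of trials sampled at 128 Hz), the following Python sketch applies a Butterworth band-pass filter to a single channel; the filter order and the zero-phase filtering are illustrative assumptions, not details taken from the paper.

    # Sketch of the band-pass preprocessing described in Section II-A.
    # Filter order and zero-phase filtering are assumptions for the example.
    import numpy as np
    from scipy.signal import butter, filtfilt

    def bandpass(trial, fs=128.0, lo=0.5, hi=30.0, order=4):
        """Zero-phase Butterworth band-pass filter applied to one EEG channel."""
        b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, trial)

    # Example: filter a simulated 8 s single-channel trial sampled at 128 Hz.
    fs = 128
    trial = np.random.randn(8 * fs)
    filtered = bandpass(trial, fs)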
B. Feature Extraction Procedure

The frequency bands in which ERS and ERD occur, providing discriminative information for classification, vary from subject to subject [3], [8].
The resultant characteristic changes in the EEG spectral content around the µ (8–12 Hz) and central β (18–25 Hz) ranges depend on the associated neural structure and the locations where the neurons are active during MI. The experiments showed that for the eleven subjects examined, there was ERD of the µ rhythm on the contralateral side and a slight ERS in the central β rhythm on the ipsilateral hemisphere. This hemispheric asymmetry reflected in the EEGs is exploited to differentiate between the MIs.

The quality of the spectral representations examined in this paper is assessed in terms of distinctive properties of the features obtained from windowed EEG signals. To this end, the EEG trials of length … were divided into windows along the time axis, depending on the settings of two parameters: the window size, …, and the amount of overlap, …. The number of windows could be calculated as shown below

(1)

Next, relevant portions of information corresponding to spectral correlates of ERD/ERS phenomena were extracted separately from each EEG window within a trial using the techniques examined in this paper. The frequency components related to ERD and ERS were merged together to reduce the dimensionality of the feature space (as explained in Sections II-B1–II-B5). Closer analysis performed at initial stages using separate correlates of ERD and ERS revealed instability and variability of ERS patterns in the β frequency bands. Therefore, using each feature as a separate dimension of the feature space was abandoned. As a result, a feature vector representing a BCI trial was composed of … elements given two recording channels,

(2)

The most reactive frequency bands from which to extract features for the given subjects were tuned by optimizing the CA evaluated with an LDA classifier in a five-fold cross-validation (CV) procedure.
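The bodies of (1) and (2) did not survive extraction. A plausible form consistent with the surrounding description (trial length T, window size w_s and overlap w_o in samples, and one merged band-power feature per window and recording channel) is sketched below; this is a hedged reconstruction, not the paper's exact notation.

    N_w = \left\lfloor \frac{T - w_o}{w_s - w_o} \right\rfloor  \tag{1'}

    \mathbf{x} = \left[\, f_1^{C3}, \dots, f_{N_w}^{C3},\; f_1^{C4}, \dots, f_{N_w}^{C4} \,\right]  \tag{2'}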
1) Spectral Estimation Methods: Spectral density methods extract information from a signal as a stochastic process to describe the distribution of its power in the frequency domain. The PSD is defined as the Fourier transform (FT) of the signal's autocorrelation function provided that the signal is stationary in a wide sense [20]. In practice, a comprehensive statistical characteristic of the random signal is not available and its spectral content can only be estimated from a sequence of time samples. Parametric approaches work on the assumption that the underlying process can be described parametrically as the output of a linear system driven by white noise. In this work, the coefficients of an AR model were estimated from segments of EEG time series using the Yule–Walker algorithm [20] and the corresponding PSD was then derived from its frequency response. This parametric spectral estimation technique is referred to throughout the paper as ….

The two other methods allowed for nonparametric estimation of the signal's PSD from its periodogram [20]. It was either calculated as the FT of the windowed autocorrelation function of the signal or using Welch's approach, referred to as the modified periodogram [20], consisting in averaging multiple spectral estimates for shorter data segments and windowing. The average power in a given frequency band was obtained from a rectangle approximation of the integral of the signal's PSD. This led to the calculation of the feature components in (2).
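For illustration, a minimal Python sketch of the nonparametric band-power computation described above (Welch's modified periodogram integrated over a band with a rectangle rule); the band edges, segment length, and the final merging of the µ and β terms by a Euclidean norm are assumptions made for the example, not values taken from the paper.

    # Sketch of a Welch-based band-power feature for one EEG window.
    import numpy as np
    from scipy.signal import welch

    def band_power(window, fs=128.0, bands=((8.0, 12.0), (18.0, 25.0))):
        """Summed PSD (rectangle rule) of `window` in each frequency band."""
        freqs, psd = welch(window, fs=fs, nperseg=min(len(window), 128))
        df = freqs[1] - freqs[0]
        return [psd[(freqs >= lo) & (freqs <= hi)].sum() * df for lo, hi in bands]

    # The paper merges the mu- and beta-related components into a single feature
    # per window and channel; a simple (assumed) merge is the Euclidean norm.
    feature = np.linalg.norm(band_power(np.random.randn(256)))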
2) Atomic Decompositions: A t-f representation produced by methods falling into this category has the form of a linear composition of elementary components referred to as atoms. In this work, two techniques were applied, the discrete short-time FT (STFT) and the S transform. The major limitation of the STFT is a trade-off between time and frequency resolutions depending on the size of the window along the time axis [1]. Here, a Gaussian window was applied. Since it is kept constant throughout the signal analysis, the balance between time and frequency localization capabilities should be determined beforehand.

The S transform combines the separate strengths of the STFT and the wavelet transform [21]. First, the signal is decomposed into t-f atoms and, second, since the analysis window changes with frequency, it can provide satisfactory resolution in both domains of interest, allowing for better time localization at higher frequency components. The discrete formula of the S transform is the following:

(3)

where … is a sampled signal (with … samples) and … is a discrete form of the window …. It is analogous to that of the STFT with the difference in the form of the Gaussian window, which is additionally parameterized with the oscillation frequency

(4)

The concepts of wavelet and STFT analysis in the S transform are incorporated by separating the kernel into two parts: the slowly varying frame and the oscillatory exponential element, which chooses the frequency bins being localized as in the STFT [21]. Unlike the frame, the oscillatory component is not translated over the time axis, which allows for independent localization of the phase and the amplitude spectrum. The implementation of the decomposition methods was done using algorithms based on the fast FT (FFT) [2].

In order to determine the feature vector components in (2), the amplitudes of the atoms representing an EEG window were time averaged over its length, and then the square norm of these mean values within the frequency bands of interest (adjusted µ and β) was calculated (cf. Fig. 2).
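Since the bodies of (3) and (4) were lost in extraction, the following is a hedged sketch of the standard (continuous-form) S transform of Stockwell et al. [21], to which the text refers; the paper's own discrete-time notation may differ.

    S(\tau, f) = \int_{-\infty}^{\infty} x(t)\,
      \frac{|f|}{\sqrt{2\pi}}\, e^{-\frac{(\tau - t)^2 f^2}{2}}\,
      e^{-i 2\pi f t}\, dt  \tag{3'}

    w(t, f) = \frac{|f|}{\sqrt{2\pi}}\, e^{-\frac{t^2 f^2}{2}}  \tag{4'}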
320 IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, VOL. 16, NO. 4, AUGUST 2008

3) Quadratic Energy Distributions: Unlike the linear atomic decomposition that deals with elementary t-f components, the energy-based approach consists in distributing the energy of the signal in the joint t-f domain. The Wigner–Ville distribution (WVD) served as a starting point in this research. It belongs, like the other transforms studied here, to the Cohen's class of quadratic t-f representations based on second-order moments of the signal, covariant by shifts in time and frequency [2], [22].

Fig. 2. Schematic presentation of the EEG feature extraction procedure with atomic decomposition techniques and quadratic energy distribution approaches.

For a discrete-time analytic signal (using the same symbol nomenclature as in (3), with … being the conjugate signal), the WVD transform is theoretically defined as [23]

(5)
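The body of (5) was lost in extraction; as a hedged sketch, the standard continuous-form Wigner–Ville distribution of an analytic signal x(t), with x*(t) its complex conjugate, reads as follows (the paper's discrete-time notation may differ).

    W_x(t, f) = \int_{-\infty}^{\infty}
      x\!\left(t + \tfrac{\tau}{2}\right)\, x^{*}\!\left(t - \tfrac{\tau}{2}\right)\,
      e^{-j 2\pi f \tau}\, d\tau  \tag{5'}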
As a bilinear transform, the WVD introduces cross terms between energy components in different t-f regions according to the quadratic superposition principle [2]. The resultant interference terms may overlap with the desirable auto-terms, thus likely leading to misinterpretation. This can be alleviated by a windowing operation over the time axis. It is equivalent to frequency smoothing of the WVD (with the window length as a parameter) and results in the pseudo WVD (PWVD) [22]. In consequence, at the cost of lower t-f resolution, the interferences are attenuated. More flexibility in the smoothing can be gained by separable smoothing in time and frequency (with separate lengths for the respective windows), which leads to the smoothed-pseudo WVD (SPWVD) [22]. It facilitates reaching a compromise between the decrease in the t-f resolution and the clearer readability of the energy map due to the attenuation of the interferences.

Another group of the quadratic t-f representations examined in this paper are reduced interference distributions (RIDs). They apply cross-term filtering to the WVD in the so-called ambiguity plane [2]. In particular, RIDs as members of the Cohen's class can be represented using the 2-D FT of the WVD and the parameterization function in the following continuous form:

(6)

Formulation in the space with new relative coordinates, Doppler and delay, facilitates representing the original signal in the ambiguity plane with the cross terms shifted towards the origin. This allows for interpretation of the function in (6) as a low-pass filter that attempts to reject the cross terms and leave the signal terms unchanged. The shape of the filter can be adapted to the analysis of the desirable signal components depending on the nature of the signal. For RIDs, the kernel is built from a monotonically decreasing function constrained so as to ensure that the parameterization function is a low-pass filter in the ambiguity plane. Depending on the canonical form of this function, the resultant RIDs have different cross-term rejection capabilities. In this work, three RID distributions, Choi–Williams, Born–Jordan, and Zhao–Atlas–Marks, were examined due to their successful applications in EEG analysis [2]. Similar to the SPWVD, the filters for the Choi–Williams, Born–Jordan, and Zhao–Atlas–Marks distributions were parameterized with the lengths of the time and frequency smoothing windows.

The EEG feature vector components in (2) were obtained by carrying out operations analogous to the procedure used for atomic decompositions (cf. Fig. 2). In consequence, the square norm of the time-averaged energy components within the frequency bands of interest served as the feature value. The quadratic transformations were performed on segments of an EEG trial using the implementation available in [22].
4) Wavelet-Based Approaches: Wavelet-based methods have been heavily exploited in EEG signal analysis (for a review, refer to [2]) and, consequently, in BCI research, e.g., [9], [10], to capture the spectral dynamics of EEG trials. Their computational speed and capability to discriminate both the temporal and spectral domain features of signals are important assets [1]. The concept of a multiscale approximation associated with the wavelet transform (WT) allows for effective localization of signal components with various spectro-temporal characteristics. In this light, signal decomposition into a linear combination of shifted and scaled versions of the mother wavelet can be considered as its harmonic analysis with multiband filters of varying widths. Hence, wavelet-based approaches do not suffer the t-f resolution trade-off inherent to the STFT and thus more naturally lend themselves to the analysis of nonstationary time series like EEGs. In this work, two WT-related methods were utilized: a generalized form of discrete wavelet decomposition, referred to as the wavelet packet (WP) transform (WPT), and the discrete-time form of the continuous WT (CWT).

WPT offers flexibility in decomposing the signal into frequency bands of various sizes and a concise representation within a dyadic set of the wavelet scales and translations. It is particularly important in the analysis of EEGs for BCI purposes since it facilitates suitable adjustments to the reactive ERD/ERS bands for individual subjects. This need dictated the basis selection for the WP tree in this work. Appropriate WP coefficients at scale levels corresponding to the frequency bands of interest were squared and summed to represent the energy of the decomposed segment of an EEG trial in the specific ERD/ERS bands. The sum then served as the EEG feature vector component. Symlet (sym) and Daubechies (db) mother wavelets of different orders were used.
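A rough Python sketch of the wavelet-packet band-energy feature described above, using the PyWavelets package; the mother wavelet, decomposition level, and node-to-band assignment are illustrative assumptions (the paper tunes the basis and the reactive bands per subject).

    # Sketch of a WP band-energy feature: sum of squared coefficients in the
    # tree nodes assumed to cover the bands of interest (illustrative nodes).
    import numpy as np
    import pywt

    def wp_band_energy(window, wavelet="db4", level=4, nodes=("aad", "add")):
        """Sum of squared WP coefficients over the selected tree nodes."""
        wp = pywt.WaveletPacket(data=window, wavelet=wavelet,
                                mode="symmetric", maxlevel=level)
        return sum(float(np.sum(wp[node].data ** 2)) for node in nodes)

    energy = wp_band_energy(np.random.randn(256))  # one 2 s window at 128 Hz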
HERMAN et al.: COMPARATIVE ANALYSIS OF SPECTRAL APPROACHES TO FEATURE EXTRACTION 321

TABLE I. LIST OF SPECTRAL TECHNIQUES USED IN FEATURE EXTRACTION

TABLE II. LIST OF THE CLASSIFIERS APPLIED IN THE STUDY

In the second WT-related approach adopted in this work, the redundancy of the CWT was exploited to enhance localization of the relevant spectral components of the signal. The discrete-time complex Morlet wavelets with the scales finely adjusted to the individualized ERD/ERS bands were applied to EEG segments. This was performed using the relation between the scale, the main receptive frequency, the effective frequency width, and the characteristic eigenfrequency of the wavelet [10]. As a result, every time point within a signal segment was associated with two CWT coefficients corresponding to the desirable frequency bands within µ and β. They were squared and time averaged over the window length in each band. The square norm of the two resultant quantities was then used as the EEG feature vector component.

5) AR Model: The AR parameterization of a signal, which gives rise to a maximum entropy spectral estimate, has been heavily exploited in EEG modeling (for a review, refer to [1]) and applied in BCI research to date, e.g., [11]. In consequence, the approach consisting in representing quasi-stationary segments of the EEG trials by a set of AR coefficients was employed in this work for comparative purposes. From a number of available estimation methods (for a review, refer to [24]), the Yule–Walker algorithm [20] was chosen to determine the parameters of an AR model of a given order. An EEG feature component corresponding to an EEG trial segment has the form of a subvector consisting of the AR coefficients

(7)

All the spectral techniques discussed above are listed in Table I with their acronyms and sets of adjustable parameters.
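As an illustration of the AR parameterization in (7), the following Python sketch estimates the AR coefficients of one quasi-stationary segment from the Yule–Walker equations; the model order is an illustrative assumption rather than a value from the paper.

    # Sketch of Yule-Walker AR coefficient estimation for one EEG segment.
    import numpy as np
    from scipy.linalg import solve_toeplitz

    def yule_walker_ar(segment, order=6):
        """Return AR(order) coefficients from the biased autocorrelation."""
        x = segment - segment.mean()
        n = len(x)
        r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
        # Solve the Toeplitz system R a = r[1:] for the AR coefficients.
        return solve_toeplitz((r[:-1], r[:-1]), r[1:])

    coeffs = yule_walker_ar(np.random.randn(128))  # one subvector of the AR feature in (7)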
C. MI Classification

The aim of BCI classification is to assign EEG trials to the classes of the associated mental tasks. The separability of the EEG features extracted with the techniques being compared was analysed using LDA, RFD, and SVMs with linear and nonlinear kernels, as listed in Table II. The CA rate was used as a quantifier of the distinctive properties of the features.

LDA is a Bayes optimal classifier provided that the distributions of features in each of the two classes are normal with the same covariance matrix [14]. This assumption was fulfilled for only one feature set (S1). A linear separating hyperplane was determined as a result of maximizing the ratio of the interclass variance to the intraclass variance. To enhance the robustness of this linear classifier, its regularized version, RFD, was also applied. Regularization facilitated the reduction of the disadvantageous effects of outliers and strong noise on the generalization ability of the classifier [15]. Since RFD is formulated as a quadratic programming (QP) problem [15], it was implemented using the Matlab QP solver. Unlike an LDA classifier, RFD requires one regularization parameter, which was selected using a five-fold CV procedure [25].

SVMs have also been heavily used in BCI research [11], [26]. This machine learning approach, based on the idea of structural risk minimization [27], has proven to perform well in binary classification tasks, offering exceptional generalization capability [16], which is ensured as a result of searching for a separating hyperplane with the largest margin between classes [27]. In the so-called soft version of an SVM [16], the costs of misclassification of a portion of data points in a certain neighborhood of the decision boundary are neglected. The size of the neighborhood is controlled by a corresponding regularization parameter that determines the trade-off between the training error and the size of the margin between classes. Its selection is crucial for the effective setup of an SVM classifier. The separating hyperplane is found by solving a QP problem with one global optimum [16]. Depending on the formulation of the optimization problem, different SVM variants can be adopted. In this work, the so-called L1-SVM and L2-SVM were investigated first. They refer to SVMs with linear (ℓ1-norm) and quadratic (ℓ2-norm) penalization of misclassified examples outside the given neighborhood [16]. They were implemented using the SVM-KM toolbox [28]. For comparison, a least squares SVM (LS-SVM) was also applied here using the implementation in [29]. It is a reformulation of standard SVMs that leads to a less computationally demanding set of linear equations than a QP problem [29].
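A minimal Python sketch of the kind of classifier comparison described above, using scikit-learn rather than the paper's Matlab implementations; a shrinkage-regularized LDA stands in for RFD, and the feature matrix is synthetic.

    # Illustrative comparison of linear and nonlinear classifiers by five-fold CV.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    X, y = np.random.randn(100, 8), np.random.randint(0, 2, 100)  # toy band-power features

    classifiers = {
        "LDA": LinearDiscriminantAnalysis(),
        "regularized LDA (stand-in for RFD)": LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"),
        "linear SVM": SVC(kernel="linear", C=1.0),
        "Gaussian SVM": SVC(kernel="rbf", C=1.0, gamma="scale"),
    }
    for name, clf in classifiers.items():
        ca = cross_val_score(clf, X, y, cv=5).mean()  # CA rate estimate
        print(f"{name}: CA = {ca:.2f}")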
322 IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, VOL. 16, NO. 4, AUGUST 2008

One of the strengths of SVMs exploited in this work is the fact that they are kernel machines, thus allowing for straightforward transformation into nonlinear classifiers [16]. A nonlinear kernel function is implicitly utilized to map the original input space to a higher dimensional feature space in which the problem is more likely to be linearly separable. Two nonlinear kernels were employed: Gaussian and polynomial [16]. However, due to the poor performance of SVMs with the polynomial mapping when applied to the MI classification, only the following Gaussian kernel function was used:

(8)

It is emphasized that the kernel is inhomogeneous if the components of the …-dimensional vectors, … and …, have different scaling factors ….
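The body of (8) did not survive extraction; a hedged sketch of an inhomogeneous Gaussian kernel consistent with the description above (per-dimension scaling factors over d feature dimensions) is:

    K(\mathbf{x}, \mathbf{z}) =
      \exp\!\left( - \sum_{i=1}^{d} \frac{(x_i - z_i)^2}{2\sigma_i^2} \right)  \tag{8'}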
Selecting an optimal parameter set poses a considerable difficulty but is crucial for effective utilization of SVMs. An SVM classifier is controlled by multiple parameters, such as the regularization constant and the kernel parameters, which need to be carefully chosen to minimize the generalization error [27]. In this work, it was estimated using a CV scheme and a theoretical radius-margin bound [30]. The search for an optimal set of the parameters was performed with a genetic algorithm (GA) due to its ability to efficiently solve optimization problems given vast, multidimensional, and highly irregular search spaces [31]. A roulette wheel selector along with two-point crossover (with probability rate …) and binary mutation operators were utilized. Through a smaller-scale pilot study employing micro-GAs, this setup was found to deliver satisfactory performance in terms of computational time and the quality of the resultant SVM classifier. Moreover, an elitist strategy was applied with a replacement rate of … (for a population size of …). The GA search was halted after … generations due to a considerable drop in the convergence rate. Two variants of the objective function to be minimized were used. The first was based solely on the average rate of the classification error evaluated using multiple-run (here, 10) five-fold CV, whereas the second was defined as a weighted sum of this CV error estimate and the theoretical upper bound on the generalization error referred to as a radius-margin bound. The outcome of their application is further discussed in Section IV.

An extensive examination of the stochastic nature of the GA optimization procedure facilitated the observation that ten randomly initiated runs of the algorithm resulted in low-variance average values of both objective functions applied in the experiments. Moreover, after ten runs of the GA search scheme it became clear that the regions of the solution space explored by the algorithm at the final stages of the optimization process started to overlap to a considerable degree. Therefore, the GA-based algorithm was run ten times to identify the desirable parameter setup of an SVM classifier for each given feature set. In addition, gradient-based minimization of the estimate of the generalization error bound [30] was conducted for the resultant SVM designs following the GA search to enhance the local optimization capabilities of the algorithm. It did not, however, lead to finding better solutions in terms of the CV estimate of the generalization error than those originally selected by the GAs, which confirmed the desired optimality of the GA solutions in the given regions of the SVM parameter space.

III. RESULTS

The process of parameter selection for each of the feature extraction methods was conducted first. The most reactive frequency bands, the setup of the windowing procedure (window size and overlap), and the set of adjustable parameters of the spectral techniques listed in Table I were optimized using a ten-run five-fold CV procedure on Session I data with the average CA as a performance measure. A preliminary study showed consistent trends of these parameter setups across different classifiers, which allowed for further tuning to maximize the interclass feature separability quantified using LDA only.

Next, the overall separability of the EEG features was evaluated on Session I data in terms of the discrimination error estimate made using the inner–outer CV scheme (five-fold splits with multiple runs) with all the classifiers under consideration. To this end, Session I data was divided into five outer folds and the classifiers' parameters were optimized in the GA framework only on the training set, i.e., the inner fold (again with a five-fold inner CV split) of each outer split. In addition, both the outer and inner five-fold splits were performed in multiple runs after random data shuffling. The final overall CA rate was obtained as the outer-test-fold average across multiple runs.
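A compact Python sketch of the inner–outer CV idea described above, with scikit-learn's grid search standing in for the paper's GA-based parameter selection; the parameter grid and the feature matrix are illustrative.

    # Nested (inner-outer) CV: the inner loop selects hyperparameters on the
    # training folds only, the outer loop estimates the CA of the whole procedure.
    import numpy as np
    from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold
    from sklearn.svm import SVC

    X, y = np.random.randn(160, 8), np.random.randint(0, 2, 160)  # toy Session I features

    inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

    search = GridSearchCV(
        SVC(kernel="rbf"),
        param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1.0]},
        cv=inner,
    )
    ca = cross_val_score(search, X, y, cv=outer).mean()
    print(f"outer-fold CA estimate: {ca:.2f}")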
Analogously to the selection procedure used for the spectral parameters, another ten-run five-fold CV procedure was utilized (without inner–outer fold splits) within the framework of the GA to identify the parameter setup of the SVM/RFD classifiers for use in the second experiment, a single-pass test on Session II data. This experiment facilitated verification of the session-to-session generalization performance. It is crucial for BCI systems since the dynamics of EEG recordings acquired at distant times (different sessions) is known to vary considerably. These variations degrade the intersession performance of BCIs [32]. A key question is to what extent the spectral methods applied in this work capture the most salient and persistent (session-to-session) characteristics of EEGs in the context of MI classification. Generalization capabilities of the classifiers also play an important role in this regard. The test of session-to-session performance was conducted using the unseen Session II data. To this end, the optimal configurations of the parameters identified using CV on Session I data were kept unchanged. The classifiers were then trained on the features extracted from Session I and tested on those from Session II.

The results of this multisubject study evaluated using the inner–outer CV over Session I and the single-pass test on Session II are presented in Fig. 3(a) and (b), respectively. The CA rates were averaged over eleven subjects for every technique and classifier type. For SVMs, the classifiers performing best on Session I in each category, linear and nonlinear, were considered as the optimal linear and nonlinear SVMs, respectively. It was noted that the Session I performance estimated with the inner–outer CV scheme [Fig. 3(a)] showed similar trends to the mean CAs obtained from the ordinary ten-run five-fold CV during the selection of the classifiers' parameters (not reported here).

Fig. 3. (a) Mean CA rates obtained on Session I data using the inner–outer CV scheme (five-fold splits with multiple runs) and averaged over 11 subjects. Vertical lines denote the intersubject standard deviations of the respective mean CA values. (b) Average CA rates obtained in the single-pass test on Session II data for 11 subjects. Vertical lines denote the intersubject standard deviations of the respective mean CA values.

Fig. 4. Comparative analysis of the categories of frequency techniques in terms of the average CA rates obtained over 11 subjects with different classifiers: (a) on Session I data (inner–outer CV scheme) and (b) on Session II data (single-pass test). Vertical lines denote the intersubject standard deviations of the respective mean CA values.

TABLE III. LIST OF THE AVERAGE AND BEST CA RATES OBTAINED FOR EACH SUBJECT ACROSS ALL THE CLASSIFIERS IN SESSIONS I AND II. OPTIMAL FEATURE TYPES CORRESPONDING TO THE BEST RESULTS ARE PRESENTED IN COLUMNS "FEATURE" FOR BOTH SESSIONS

Fig. 4(a) and (b) depicts the average classification performance associated with different categories of frequency techniques: atomic decompositions, WV energy distributions, RIDs, PSD estimators, t-s approaches, and AR modeling. The resultant mean value for each category, shown in the form of a bar, was calculated for every subject as the average CA rate obtained with all the classifiers and the category-specific features. Again, the mean and the standard deviation across the subjects are shown for Session I and Session II. In addition, Table III lists the best and average CA rates obtained for each subject across all the classifiers. It illustrates intersubject differences in performance on Session I and Session II data.

A comparative analysis of the performance of the classifiers employed in this work, including all types of SVMs listed in Table II, was also conducted. The corresponding CAs were averaged over the spectral methods and their means obtained for eleven subjects are presented along with the standard deviations in Fig. 5 for Session I and Session II data.

IV. DISCUSSION

The results of this multisubject study were processed independently for the two sessions of the data in the framework of two-way analysis of variance (ANOVA) with repeated measures [33]. This approach facilitated effective accounting for the rather high intersubject variance. The two within-subject factors were frequency technique (14 types) and classifier, represented by LDA, RFD, and two collective categories of the optimal linear and the optimal nonlinear SVM.

… was excluded from this analysis since it introduced an interaction term and violated the assumption of sphericity [33].

Both factors, classifier type and feature type, were found to significantly affect the results of the Session I … and Session II … experiments, i.e., the null hypothesis of equality was rejected. No other effects or significant interactions between the factors were detected. Tukey's honestly significant difference criterion [33] was used to perform the factorial post-test simultaneous comparison of feature sets and classifiers. In consequence, … was found to deliver, on average, significantly higher CA rates than … on Session I data (at the significance level of …). The test results obtained on Session II revealed that two PSD approaches, … and …, outperformed … and …. Analysis of the main effect of each classifier demonstrated that the optimal nonlinear SVM provided significantly better average performance than LDA on Session I data. On the other hand, an analogous post-ANOVA multiple comparison of classifiers tested on Session II indicated the inferiority of the optimal nonlinear SVM in relation to the optimal linear SVM.
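For illustration, a hedged Python sketch of a two-way repeated-measures ANOVA of CA rates with statsmodels; the long-format arrangement of the results (one CA value per subject, feature type, and classifier) and the synthetic numbers are assumptions about how such an analysis could be set up, not the paper's actual data.

    # Two-way repeated-measures ANOVA with feature type and classifier as
    # within-subject factors, on a synthetic long-format table of CA rates.
    import numpy as np
    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    rng = np.random.default_rng(0)
    subjects = range(11)
    features = ["PSD", "STFT", "WPT"]          # illustrative subset of the 14 types
    classifiers = ["LDA", "RFD", "SVM"]
    rows = [
        {"subject": s, "feature": f, "classifier": c, "CA": 70 + rng.normal(scale=5)}
        for s in subjects for f in features for c in classifiers
    ]
    df = pd.DataFrame(rows)

    print(AnovaRM(df, depvar="CA", subject="subject",
                  within=["feature", "classifier"]).fit())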
An analysis of the average classification performance obtained in both sessions with the spectral methods grouped in six categories [Fig. 4(a) and (b)] leads to more general conclusions. First, it reveals the overall robustness and consistency of PSD approaches in generating the most distinguishable sets of features from MI induced EEG trials. Moreover, the high CA rates produced with the … approach in the evaluation on Session I data and their radical deterioration in the single-pass test on Session II data deserve special attention. This contrast can be partly explained by the excessively high dimensionality of the corresponding EEG feature vector and the low specificity of the distinctive spectral patterns relevant to the intersession classification problem. It has also been observed that the AR feature distribution, which is highly sensitive to noise in a raw signal, markedly varies from one session to the other. Finally, the other four categories of spectral methods displayed comparable performance with consistent trends among classifier types in each session. It is worth noting that …, …, and … produced clearly the least separable features in their respective categories [Fig. 3(a) and (b)].

As can be seen from Table III, the optimal spectral feature extraction techniques vary depending on the subject for Session I data, whereas when session-to-session performance is considered, PSD approaches, mainly … and …, clearly dominate. This confirms the overall superior performance of PSD methods with emphasis on their intersession generalization properties.

Fig. 5. Comparative analysis of the classifiers' performance averaged over 11 subjects across all the feature extraction methods in each session (inner–outer CV scheme with multiple runs of five-fold splits on Session I and single-pass test on Session II). Vertical lines denote the intersubject standard deviations of the respective mean CA values.

Fig. 5 facilitates a more detailed comparative analysis of the performance of the classifiers listed in Table II, conducted to complement the outcome of the ANOVA tests. It can be observed that the highest CA rates in the CV evaluation on Session I data were obtained with the … and … classifiers, most often playing the role of the optimal nonlinear SVM. However, their performance in the intersession classification significantly deteriorated and, in consequence, SVMs with homogeneous kernels, mainly the … and … variants, were found to be more robust. In the … category, the … variant demonstrated the supreme effectiveness in both experiments and thus, in the majority of cases, was considered as the optimal representative. On average, … and RFD proved to be the most reliable approaches in the overall two-session analysis. Although the optimal nonlinear SVM yielded the highest CA rates in the CV on Session I, confirmed in the ANOVA framework, its deficient generalization capability in the session-to-session evaluation renders the nonlinear SVM approach, particularly with an inhomogeneous kernel, less practical in application to MI classification problems.

As mentioned earlier, two objective functions were utilized in the GA-based procedure for optimization of the SVMs' parameters. The combination of the CV error estimate and the theoretical radius-margin bound often led to better session-to-session results but at the cost of the performance on Session I. Still, the highest CA rate on Session I data was the criterion for optimal SVM selection among those produced by the GA.

The analysis reveals a rather high variance in the CA rates obtained across the subjects considered in this study. Its origins appear to be multifactorial. In the first place, considerable differences in the levels of experience that the study participants had in undergoing BCI experiments, reflected in a varying degree of their BCI skills, should be emphasized. It is also related to differing levels of subjects' motivation and immersion in the BCI training paradigm observed before and after the recording sessions. Finally, it has been shown that the clarity and intensity of the MI induced ERD/ERS manifestations in EEG is highly subject-dependent [34]. Individual neurophysiological features can thus affect the user's capability to operate a BCI system. Closer inspection of the t-f power spectral distributions of the MI induced EEG trials in this study revealed that in some subjects the ERD/ERS correlates were found scarcely discernible. These observations corresponded to the level of the classification performance obtained for the given individuals (cf. Table III).

Although the feature extraction framework presented in this paper is not intrinsically causal, it can be modified to satisfy the requirements of online applicability. To this end, a moving window method with continuous classification on a sample-by-sample basis should be adopted.

The current time point within a trial should then serve as the end point of the feature extraction window. At the same time, this approach is associated with a delayed response of the BCI system to the changes in the spectral content of the ongoing EEG activity. The window size of up to 2 s, used in this work, allows for balancing the time and the frequency resolution of the analysis of the spectral phenomena relevant to MI related EEG pattern discrimination. In consequence, the time delay of the causal feature extractor mentioned above is considered to be manageable and acceptable in online BCI operation. It should be emphasized that the proposed transformation enforcing causality still requires time-efficient implementation of the spectral techniques. Sample-by-sample classification in online mode within the framework discussed here implies that the spectral quantification of an EEG trial within a moving window has to be performed faster than the sampling period. Preliminary simulations conducted in the ISRC laboratory setup (with a Pentium IV 3 GHz, 512 MB RAM) demonstrated the feasibility of applying the spectral methods, including quadratic energy distributions with their highest computational complexity, in online EEG-based BCI systems with real-time constraints (… for moving windows of equal or less than 2 s duration, which corresponds to the maximum of 256 data points processed within 1/128 s).
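A schematic Python sketch of the causal, moving-window mode of operation outlined above: at every new sample, the most recent 2 s window is re-featured and re-classified. The feature extractor and classifier are placeholders standing in for the Section II methods.

    # Causal moving-window classification loop (placeholders for the real methods).
    import numpy as np

    def extract_features(window):            # placeholder for any Section II method
        return np.array([np.var(window)])

    def classify(features):                  # placeholder for a trained classifier
        return int(features[0] > 1.0)

    fs, win_s = 128, 2.0
    window_len = int(win_s * fs)             # up to 256 samples, as discussed above
    eeg_stream = np.random.randn(7 * fs)     # simulated single-channel trial

    decisions = []
    for t in range(window_len, len(eeg_stream) + 1):
        window = eeg_stream[t - window_len:t]   # window ending at the current sample
        decisions.append(classify(extract_features(window)))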
V. CONCLUSION AND FUTURE WORK

This paper has investigated various approaches to quantifying the relevant spectral content of the EEG utilized for MI classification. The separability of the EEG feature sets representing entire BCI trials has been evaluated offline. Special attention has been paid to the session-to-session generalization properties of the classifiers and the feature sets.

A detailed ranking of the most effective spectral methods in terms of the resultant CA rates is rather subject-specific. There are a few cases of techniques achieving good performance in one subject and poor performance in another. Similarly, high CA rates in one session do not necessarily correspond with satisfactory results in the other session. Close observation of the trends across all eleven subjects has demonstrated the superior robustness of PSD approaches, consistent with all the classifiers used, in one- and two-session classification tasks. The most inferior session-to-session stability was produced by the … approach.

With regard to classification methods, the study has revealed the superiority of SVMs with a Gaussian kernel in Session I experiments for the majority of the feature types. Clearly poorer performance is observed only for the … approach. In session-to-session experiments, the selection of an optimal classifier that robustly maintains good performance from one session to the other is more subject and feature type dependent. Generally, the consistent effectiveness of the linear approaches investigated in this work, …, …, and RFD in particular, is emphasized.

Although the concept of discrete classification performed at the end of each trial is less practical in real-world BCI applications than a continuous approach, this thorough study still gives a valuable insight into the separability of MI induced EEG patterns, in the same spirit as in [9], [17], and [18]. It also offers some informed guidance as to which techniques are more suitable for BCI. The combination of different EEG datasets with a range of motor imagery dynamics from beginning to end of trial, obtained during MI-based BCI experiments undertaken by subjects with varying levels of prior experience and capabilities in using EEG-based communication devices, enabled the production of more general conclusions at the cost of relatively lower average performance and higher intersubject variance. The extensive complementary online analysis, within the framework of causal feature extraction with a moving window approach and continuous classification, is intended for further work.

ACKNOWLEDGMENT

The authors would like to thank the reviewers and the Associate Editor for their constructive comments and suggestions.

REFERENCES

[1] E. Niedermeyer and F. Lopes da Silva, Eds., Electroencephalography: Basic Principles, Clinical Applications and Related Fields, 4th ed. Baltimore, MD: Williams & Wilkins, 1998.
[2] M. Akay, Ed., Time Frequency and Wavelets in Biomedical Signal Processing, ser. Biomedical Engineering. Piscataway, NJ: IEEE Press, 1997.
[3] G. Pfurtscheller, C. Neuper, D. Flotzinger, and M. Pregenzer, "EEG-based discrimination between imagination of right and left hand movement," Electroencephalogr. Clin. Neurophysiol., vol. 103, pp. 642–651, 1997.
[4] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan, "Brain-computer interfaces for communication and control," Clin. Neurophysiol., vol. 113, pp. 767–791, 2002.
[5] J. R. Wolpaw, D. J. McFarland, G. W. Neat, and C. A. Forneris, "An EEG-based brain-computer interface for cursor control," Electroencephalogr. Clin. Neurophysiol., vol. 78, pp. 252–259, 1991.
[6] J. del R. Millan, M. Franze, J. Mourino, F. Cincotti, and F. Babiloni, "Relevant EEG features for the classification of spontaneous motor-related tasks," Biol. Cybern., vol. 86, pp. 89–95, 2002.
[7] E. Yom-Tov and G. F. Inbar, "Feature selection for the classification of movements from single movement-related potentials," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 10, no. 3, pp. 170–177, Sep. 2002.
[8] G. Pfurtscheller et al., "Current trends in Graz brain-computer interface (BCI) research," IEEE Trans. Rehab. Eng., vol. 8, no. 2, pp. 216–219, Jun. 2000.
[9] B. H. Yang, G. Z. Yan, R. G. Yan, and T. Wu, "Adaptive subject-based feature extraction in brain-computer interfaces using wavelet packet best basis decomposition," Med. Eng. Phys., vol. 29, no. 1, pp. 48–53, 2007.
[10] S. Lemm, C. Schafer, and G. Curio, "BCI competition 2003—data set III: Probabilistic modeling of sensorimotor µ rhythms for classification of imaginary hand movements," IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 1077–1080, Jun. 2004.
[11] D. Garrett, D. A. Peterson, C. W. Anderson, and M. H. Thaut, "Comparison of linear, nonlinear, and feature selection methods for EEG signal classification," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 11, no. 2, pp. 141–144, Jun. 2003.
[12] B. Obermaier, C. Guger, C. Neuper, and G. Pfurtscheller, "Hidden Markov models for online classification of single trial EEG data," Pattern Recognit. Lett., vol. 22, pp. 1299–1309, 2001.
[13] I. Kalatzis et al., "Design and implementation of an SVM-based computer classification system for discriminating depressive patients from healthy controls using the P600 component of ERP signals," Comput. Methods Programs Biomed., vol. 75, pp. 11–22, 2004.
[14] C. Bishop, Neural Networks for Pattern Recognition. Oxford, U.K.: Oxford Univ. Press, 1995.
[15] K.-R. Muller, M. Krauledat, G. Dornhege, G. Curio, and B. Blankertz, "Machine learning techniques for brain-computer interfaces," Biomedizinische Technik, vol. 49, no. 1, pp. 11–22, 2004.
[16] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[17] N. F. Ince, S. Arica, and A. Tewfik, "Classification of single trial motor imagery EEG recordings with subject adapted non-dyadic arbitrary time-frequency tilings," J. Neural Eng., vol. 3, pp. 235–244, 2006.
[18] C. Neuper, G. R. Muller, A. Kubler, N. Birbaumer, and G. Pfurtscheller, "Clinical application of an EEG-based brain-computer interface: A case study in a patient with severe motor impairment," Clin. Neurophysiol., vol. 114, pp. 399–409, 2003.

[19] E. Haselsteiner and G. Pfurtscheller, “Using time-dependent neural Girijesh Prasad (M’98–SM’07) received the
networks for EEG classification,” IEEE Trans. Neural Syst. Rehabil. B.Tech. degree in electrical engineering, the M.Tech.
Eng., vol. 8, no. 4, pp. 457–462, Dec. 2000. degree in computer science and technology, and
[20] P. Stoica and R. Moses, Introductions to Spectral Analysis. Engle- the Ph.D. degree from Queen’s University Belfast,
wood Cliffs, NJ: Prentice-Hall, 1997. Belfast, U.K.
[21] R. G. Stockwell, L. Mansinha, and R. P. Lowe, “Localization of the He is a Senior Lecturer within the Faculty of
complex spectrum: The S transform,” IEEE Trans. Signal Process., vol. Computing and Engineering in the University of
44, no. 4, pp. 998–1001, Apr. 1996. Ulster, Ulster, U.K. His research interests include
[22] F. Auger, P. Flandrin, P. Goncalves, and O. Lemoine, Time-Frequency self-organizing hybrid intelligent systems, neural
toolbox CNRS Research Group, Rice University, 1996. computation, fuzzy logic, evolutionary algorithms,
[23] B. Boashash, , S. Haykin, Ed., “Time-frequency signal analysis,” in predictive modeling, and control with applications in
Advances in Spectrum Analysis and Array Processing 1. Englewood industrial and biological systems including brain–computer interface (BCI) and
Cliffs, NJ: Prentice-Hall, 1991, pp. 418–517. robotic systems. He has published over 70 papers in journals and conference
[24] S. Haykin, Adaptive Filter Theory, 2nd ed. Englewood Cliffs, NJ: proceedings.
Prentice-Hall, 1991.
Dr. Prasad is a Chartered Engineer and a member of the IET.
[25] M. T. Mitchell, Mach Learn. New York: McGraw-Hill, 1997.
[26] K.-R. Muller, C. W. Anderson, and G. E. Birch, “Linear and nonlinear
methods for brain-computer interfaces,” IEEE Trans. Neural. Syst. Re-
habil. Eng., vol. 11, no. 2, pp. 165–169, Jun. 2003.
[27] V. N. Vapnik, The Nature of Statitical Learning Theory. New York: Thomas Martin McGinnity (M’82) received the
Springer-Verlag, 1995. first class honours degree in physics and the Ph.D.
[28] S. Canu, Y. Grandvalet, V. Guigue, and A. Rakotomamonjy, degree from the University of Durham, Durham,
Perception Systèmes et Information, INSA de Rouen, SVM and U.K.
Kernel methods Matlab toolbox France, 2005 [Online]. Available: He is Professor of Intelligent Systems Engineering
http://www.asi.insa-rouen.fr/~arakotom/toolbox/index.html within the Faculty of Computing and Engineering,
[29] J. A. K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, and the University of Ulster, Ulster, U.K. He has 28 years
J. Vandewalle, Least Squares Support Vector Machines. Singapore: experience in teaching and research in electronic and
World Scientific, 2002. computer engineering and is currently Director of the
[30] O. Chapelle, V. Vapnik, O. Bousqet, and S. Mukherjee, “Choosing mul- Intelligent Systems Research Centre. His current re-
tiple parameters for support vector machines,” Mach. Learn., vol. 46, search interests are focussed on computational intel-
no. 1, pp. 131–159, 2002. ligence, hardware and software implementations of biologically plausible ar-
[31] D. E. Goldberg , Genetic Algorithms For Search, Optimization, and tificial neural networks, brain computer interfacing, and intelligent systems in
Machine Learning. Reading, MA: Addison-Wesley, 1989. robotics.
[32] P. Shenoy, M. Krauledat, B. Blankertz, R. P. N. Rao, and K.-R. Müller, Prof. McGinnity is a Fellow of the IET and a Chartered Engineer.
“Towards adaptive classification for BCI,” J. Neural Eng., vol. 3, pp.
13–23, 2006.
[33] Y. Hochberg and A. C. Tamhane, Multiple Comparison Procedures.
New York: Wiley, 1987.
[34] Event-Related Dynamics of Brain Oscillations, C. Neuper and W. Damien Coyle (M’05) received the first class degree
Klimesch, Eds. Amsterdam, The Netherlands: Elsevier, 2006, vol. in computing and electronic engineering and the
159, Progress in Brain Research. Ph.D. degree in intelligent systems engineering from
the University of Ulster, Ulster, U.K., in 2002 and
2006, respectively.
He is currently a Lecturer and a member of the
Pawel Herman (M’04) received the M.Sc. Eng. de- Intelligent Systems Research Centre. His research
gree in applied computer science in medicine and en- interests include biosignal processing, bio-inspired
gineering from Wroclaw University of Technology, cognitive and adaptive systems, and brain–computer
Wroclaw, Poland, in 2002. He is currently working interface technology.
toward the Ph.D. degree in intelligent systems at the Dr. Coyle is a member of the IET and the Isaac
University of Ulster, Derry, U.K. Newton Institute for Mathematical Sciences. He is a member of the executive
He is a member of the Intelligent Systems Re- committees of the IEEE Computation Intelligence Society (CIS) and the IEEE
search Centre as a Research Associate in Brain- Engineering in Medicine and Biology Society (EMBS) and Vice Chair of the
Computer Interfaces. His research interests en- IEEE CIS GOLD subcommittee and the CIS Awards subcommittee.
compass biosignal analysis, machine learning,
computational intelligence methods with emphasis
on hybrid adaptive systems and type-2 fuzzy logic systems, and a range of
problems in cognitive science mainly in the context of neurofeedback treatment.
