An open-access EEG dataset for speech decoding: Exploring the role of articulation and coarticulation

mcjpedro/speech_decoding

João Pedro Carvalho Moreira 1,#, Vinícius Rezende Carvalho 1, Eduardo Mazoni Andrade Marçal Mendes 1, Ariah Fallah 2, Terrence J. Sejnowski 3,4,5, Claudia Lainscsek 3,4, Lindy Comstock 2,6,*,#

1 Postgraduate Program in Electrical Engineering, Federal University of Minas Gerais, Belo Horizonte, MG 31270-901, Brazil
2 Department of Neurosurgery, University of California, Los Angeles, Los Angeles, CA 90095, USA
3 Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
4 Institute for Neural Computation, University of California San Diego, La Jolla, CA 92093, USA
5 Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
6 Department of Linguistics, National Research University Higher School of Economics, Moscow 101000, RF
* Corresponding author(s): Lindy Comstock ([email protected])
# These authors contributed equally to this work

ABSTRACT

Electroencephalography (EEG) holds promise for brain-computer interface (BCI) devices as a non-invasive measure of neural activity. With increased attention to EEG-based BCI systems, publicly available datasets that can represent the complex tasks required for naturalistic speech decoding are necessary to establish a common standard of performance within the BCI community. Effective solutions must overcome various kinds of noise in the EEG signal and remain reliable across sessions and subjects without overfitting to a specific dataset or task. We present two validated datasets (N=8 and N=16) for classification at the phoneme and word level and by the articulatory properties of phonemes. EEG signals were recorded from 64 channels while subjects listened to and repeated six consonants and five vowels. Individual phonemes were combined in different phonetic environments to produce coarticulated variation in forty consonant-vowel pairs, twenty real words, and twenty pseudowords. Phoneme pairs and words were presented during a control condition and during transcranial magnetic stimulation targeted to inhibit or augment the EEG signal associated with specific articulatory processes.

CODE AVAILABILITY

The data and code used in this work are available at OSF to allow reproducibility and sharing of information under the CC BY 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/). The routines can be found in the Study/EEG_Data_Processing/Code folder. These routines produce the analyses presented in the Technical Validation section. The results obtained with both signal processing techniques, as discussed in the Data Processing section, are stored in the same folder. The same code is available on GitHub to allow for version control and discussion of the implementation and analysis carried out in this work. The routines were built to obtain the ERP using only ICA, and signal cleaning was performed using the pipeline described in Figure 1, based on EEGLAB versions 2022.0 and 2022.1 for MATLAB.

Figure 1 - Code structure for data processing.
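
For readers unfamiliar with how an ERP is obtained from cleaned data, the general idea can be sketched as follows. This is not the authors' MATLAB/EEGLAB code: it is a minimal, stdlib-only Python illustration of the usual recipe (cut epochs around event onsets, subtract a pre-stimulus baseline, average across epochs). The function name `compute_erp`, its parameters, and the toy signal are hypothetical, and the ICA-based cleaning step is deliberately left out.

```python
from statistics import fmean


def compute_erp(signal, onsets, pre, post):
    """Average baseline-corrected epochs of a single channel.

    signal: list of samples for one channel (hypothetical input).
    onsets: sample indices of the stimulus events.
    pre, post: number of samples kept before and after each onset.
    """
    epochs = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal):
            continue  # skip events too close to the recording edges
        epoch = signal[start:stop]
        # Baseline is the mean of the pre-stimulus samples of this epoch.
        baseline = fmean(epoch[:pre]) if pre > 0 else 0.0
        epochs.append([x - baseline for x in epoch])
    if not epochs:
        raise ValueError("no complete epochs available")
    # Average sample-by-sample across epochs to obtain the ERP.
    return [fmean(samples) for samples in zip(*epochs)]


# Toy usage: a flat signal with a deflection one sample after each onset.
signal = [0.0] * 30
for onset in (10, 20):
    signal[onset + 1] = 2.0
erp = compute_erp(signal, onsets=[10, 20, 29], pre=2, post=3)
```

In the toy example the third onset is dropped because its epoch would run past the end of the recording, so the average is taken over two epochs only.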
