BrainDEC: A Multimodal LLM for the Non-Invasive Decoding of Text from Brain Recordings


Requirements

Experiments

This repository contains four experiments, associated with four different tasks and datasets:

  • Spoken text decoding (convers): Multimodal spoken text decoding during conversations (the main task of this work).
  • Perceived speech decoding (perceived): Decoding the textual content of listened stories.
  • Brain captioning (NSD): Decoding the captions of viewed images using the NSD dataset.
  • Reading text decoding (zuco): Decoding read text from EEG signals.

In the following, we detail the steps to reproduce the results of each experiment presented in the paper.

1. Spoken Text Decoding

Configuration
  • Update the configuration file by specifying the following paths, as sketched below: DATA_PATH (e.g., data/convers), RAW_FMRI_DATA_PATH (e.g., data/fmri_convers), MODELS_TRAIN_PATH (e.g., trained_models/convers), and LLM_PATH (e.g., LLMs/Meta-llama3.2-8b-Instruct).
  • Download the Convers dataset (version 2.2.0) from the OpenNeuro platform (ds001740) into the RAW_FMRI_DATA_PATH specified in the config file.
  • Create a folder named "raw_data/transcriptions" inside DATA_PATH and download the raw transcriptions from the Ortolang platform (convers/v2) into it.
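
A minimal sketch of what these configuration entries could look like. The file name and layout below are assumptions for illustration (the repository's own config file may differ); only the variable names and example paths come from the list above:

# Hypothetical sketch of the configuration entries (e.g., in a file such as configs/configs_convers.py).
# Variable names follow the list above; the values are examples only.
DATA_PATH = "data/convers"                    # preprocessed/processed data and the train/test json files
RAW_FMRI_DATA_PATH = "data/fmri_convers"      # raw fMRI recordings downloaded from OpenNeuro
MODELS_TRAIN_PATH = "trained_models/convers"  # where trained checkpoints are written
LLM_PATH = "LLMs/Meta-llama3.2-8b-Instruct"   # local path of the decoder LLM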

With DATA_PATH set to "data/convers", you should obtain a structure similar to this after data preprocessing:

data
└── convers
    ├── preprocessed_fmri_data
    │   └── fMRI_data_200
    ├── processed_data
    │   ├── fMRI_data_split
    │   ├── interlocutor_text_data
    │   └── participant_text_data
    ├── raw_data
    │   ├── transcriptions
    │   └── fmri
    ├── test.json
    └── train.json
Preprocessing, training, and evaluation
# Preprocessing raw data
python exps/convers/process_raw_bold_signal.py --n_rois 200 # Parcellation using 200 ROIs
python exps/convers/data_builder_tools/split_bold_files.py  # Processing raw 4D voxel BOLD signals and segmenting them into fixed-duration chunks
python exps/convers/data_builder_tools/textgrid_to_text.py # Processing transcription files (conversations) and segmenting them into fixed-duration text sequences

# Building training and test data
python exps/convers/data_builder_tools/build_data.py # Building json files that store the paths of BOLD chunks and the [input, output] text pairs for instruction tuning
python exps/convers/data_builder_tools/build_tokenizer.py # Building the tokenizer for the first stage of training

# Training and testing after each save_epoch
python exps/convers/train_stage1.py -m DeconvBipartiteTransformerConv --batch_size 128 --epochs 200 # Stage-1: training the DeconvBipartite Transformer
python exps/convers/train_stage2.py --batch_size 32 --epochs 100 -m BrainDEC_V0 --save_epochs 50 # Stage-2. Note: BrainDEC_V1 and BrainDEC_V2 converge more quickly than V0; only 20 epochs are needed.

# Evaluate the results of the test set and save the scores
python exps/convers/evaluation.py   

2. Perceived Speech Decoding

Configuration
  • Update the configuration file by specifying the following paths: RAW_FMRI_DATA_PATH (e.g., data/perceived), MODELS_TRAIN_PATH (e.g., trained_models/perceived), and LLM_PATH (e.g., LLMs/Meta-llama3.2-8b-Instruct).
  • Download the training and test datasets into the folders "DATA_TRAIN_DIR" and "DATA_TEST_DIR" (see the config file), as outlined in the semantic-decoding project.

With DATA_PATH set to "data/perceived" for example, you should obtain a structure similar to this after data preprocessing:

data
└── perceived
    ├── data_test
    ├── data_train
    └── processed
        ├── S1
        ├── S2
        ├── S3
        ├── fMRI_data_test_split
        └── fMRI_data_train_split
Preprocessing, training, and evaluation
# Data preparation
python exps/perceived/prepare_datasets.py -s $subject  # run once for each $subject in {S1, S2, S3}

# Build tokenizer for stage 1
python exps/perceived/build_tokenizer.py

# Stage-1 training (in a cross-subject manner)
python exps/perceived/train_stage1.py --batch_size 128

# Stage-2 training (for each subject separately)
python exps/perceived/train_stage2.py --batch_size 32 -s $subject  # run once for each $subject in {S1, S2, S3}

# Evaluate the results of the test set and save the scores
python exps/perceived/evaluation.py $subject  # run once for each $subject in {S1, S2, S3}
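
The per-subject steps above can also be scripted. Below is a minimal sketch, not part of the repository, that loops over the three subjects and calls the same scripts with the same flags via Python's subprocess module, assuming it is run from the repository root:

import subprocess

SUBJECTS = ["S1", "S2", "S3"]

# Per-subject data preparation
for subject in SUBJECTS:
    subprocess.run(["python", "exps/perceived/prepare_datasets.py", "-s", subject], check=True)

# Tokenizer and cross-subject stage-1 training
subprocess.run(["python", "exps/perceived/build_tokenizer.py"], check=True)
subprocess.run(["python", "exps/perceived/train_stage1.py", "--batch_size", "128"], check=True)

# Per-subject stage-2 training and evaluation
for subject in SUBJECTS:
    subprocess.run(["python", "exps/perceived/train_stage2.py", "--batch_size", "32", "-s", subject], check=True)
    subprocess.run(["python", "exps/perceived/evaluation.py", subject], check=True)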

3. Brain Captioning - BrainHub benchmark on NSD dataset

This experiment is a comparison on the brain understanding benchmark (BrainHub), based on the Natural Scenes Dataset (NSD) and COCO.

Configuration
  • The processed datasets are available here.
  • Download the datasets using this script.
  • Download the COCO annotations from this link into the folder 'tools'.
  • Update the configuration file to specify the paths and, if needed, modify the hyperparameters.

With DATA_PATH set to "data/nsd", you should obtain the following structure:

data
└── nsd
    └── webdataset_avg_split
        ├── test
        ├── train
        └── val
Training
  • To train and evaluate the model:
# Build tokenizer for stage 1
python exps/zuco/build_tokenizer.py

# Stage-1 training (in a cross-subject manner)
python exps/nsd/train_stage1.py --batch_size 128

# Stage-2 training (for each subject separately)
python exps/nsd/train_stage2.py --epochs 6 --save_epochs 1 --batch_size 32 -s $subject  # run once for each $subject in {1, 2, 5, 7}
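
As in the previous experiment, the per-subject stage-2 runs can be scripted; a minimal sketch (same script and flags as above), assuming execution from the repository root:

import subprocess

# Stage-2 training on NSD, one run per subject (subjects 1, 2, 5, and 7)
for subject in ["1", "2", "5", "7"]:
    subprocess.run(["python", "exps/nsd/train_stage2.py",
                    "--epochs", "6", "--save_epochs", "1",
                    "--batch_size", "32", "-s", subject], check=True)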

To get the evaluation scores for each subject based on the generated files of the test set, refer to the BrainHub benchmark project.

Results

The results presented here improve upon those reported in the paper by (1) training the first stage in a cross-subject manner, (2) using curated COCO annotations (COCO_73k_annots_curated.npy) and (3) adjusting the decoder LLM’s inference hyperparameters (see the configuration file). Results may vary slightly due to initialization and non-deterministic algorithms, but the variation remains low. Reported BrainDEC values are averaged over three runs. We compare our model with existing methods from the BrainHub benchmark:

| Method    | Eval | BLEU1 | BLEU4 | METEOR | ROUGE | CIDEr | SPICE | CLIPS | RefCLIPS |
|-----------|------|-------|-------|--------|-------|-------|-------|-------|----------|
| UMBRAE    | S1   | 59.44 | 19.03 | 19.45  | 43.71 | 61.06 | 12.79 | 67.78 | 73.54    |
| UMBRAE-S1 | S1   | 57.63 | 16.76 | 18.41  | 42.15 | 51.93 | 11.83 | 66.44 | 72.12    |
| BrainDEC  | S1   | 61.29 | 19.68 | 17.99  | 44.47 | 53.82 | 10.67 | 63.09 | 69.60    |
| BrainCap  | S1   | 55.96 | 14.51 | 16.68  | 40.69 | 41.30 | 9.06  | 64.31 | 69.90    |
| OneLLM    | S1   | 47.04 | 9.51  | 13.55  | 35.05 | 22.99 | 6.26  | 54.80 | 61.28    |
| SDRecon   | S1   | 36.21 | 3.43  | 10.03  | 25.13 | 13.83 | 5.02  | 61.07 | 66.36    |

| Method    | Eval | BLEU1 | BLEU4 | METEOR | ROUGE | CIDEr | SPICE | CLIPS | RefCLIPS |
|-----------|------|-------|-------|--------|-------|-------|-------|-------|----------|
| UMBRAE    | S2   | 59.37 | 18.41 | 19.17  | 43.86 | 55.93 | 12.08 | 66.46 | 72.36    |
| UMBRAE-S2 | S2   | 57.18 | 17.18 | 18.11  | 41.85 | 50.62 | 11.50 | 64.87 | 71.06    |
| BrainDEC  | S2   | 59.28 | 17.99 | 17.75  | 43.60 | 51.53 | 9.88  | 62.86 | 69.27    |
| BrainCap  | S2   | 53.80 | 13.03 | 15.90  | 39.96 | 35.60 | 8.47  | 62.48 | 68.19    |
| SDRecon   | S2   | 34.71 | 3.02  | 9.60   | 24.22 | 13.38 | 4.58  | 59.52 | 65.30    |

| Method    | Eval | BLEU1 | BLEU4 | METEOR | ROUGE | CIDEr | SPICE | CLIPS | RefCLIPS |
|-----------|------|-------|-------|--------|-------|-------|-------|-------|----------|
| UMBRAE    | S5   | 60.36 | 19.03 | 20.04  | 44.81 | 61.32 | 13.19 | 68.39 | 74.11    |
| UMBRAE-S5 | S5   | 58.99 | 18.73 | 19.04  | 43.30 | 57.09 | 12.70 | 66.48 | 72.69    |
| BrainDEC  | S5   | 61.82 | 19.57 | 18.70  | 44.63 | 57.65 | 11.32 | 64.03 | 70.26    |
| BrainCap  | S5   | 55.28 | 14.62 | 16.45  | 40.87 | 41.05 | 9.24  | 63.89 | 69.64    |
| SDRecon   | S5   | 34.96 | 3.49  | 9.93   | 24.77 | 13.85 | 5.19  | 60.83 | 66.30    |

| Method    | Eval | BLEU1 | BLEU4 | METEOR | ROUGE | CIDEr | SPICE | CLIPS | RefCLIPS |
|-----------|------|-------|-------|--------|-------|-------|-------|-------|----------|
| UMBRAE    | S7   | 57.20 | 17.13 | 18.29  | 42.16 | 52.73 | 11.63 | 65.90 | 71.83    |
| UMBRAE-S7 | S7   | 55.71 | 15.75 | 17.51  | 40.64 | 47.07 | 11.26 | 63.66 | 70.09    |
| BrainDEC  | S7   | 59.07 | 17.97 | 17.48  | 43.07 | 49.22 | 9.90  | 61.52 | 68.06    |
| BrainCap  | S7   | 54.25 | 14.00 | 15.94  | 40.02 | 37.49 | 8.57  | 62.52 | 68.48    |
| SDRecon   | S7   | 34.99 | 3.26  | 9.54   | 24.33 | 13.01 | 4.74  | 58.68 | 64.59    |

4. Decoding Reading Text from EEG Signals

Configuration and data preparation

The same raw data and preprocessing as in EEG-To-Text are employed here.

  • Update the configuration file "configs/configs_zuco.py" by specifying the paths, similarly to the previous experiments.
  • Download the task folders (task1-SR, task2-NR, and task3-TSR) from ZuCo v1.0 and place them inside the DATA_PATH specified in the config file (e.g., data/zuco/task1-SR/Matlab_files, etc.).
  • Download task1-NR/Matlab_files from ZuCo v2.0 and place it as task2-NR-2.0/Matlab_files inside DATA_PATH.
  • Generate the preprocessed data using the instructions below.

With DATA_PATH set to data/zuco, for example, you should obtain the following structure after data preprocessing:

data
└── zuco
    ├── processed
    │   ├── task1-SR
    │   ├── task2-NR
    │   ├── task2-NR-2.0
    │   └── task3-TSR
    ├── task1-SR
    │   └── Matlab_files
    ├── task2-NR
    │   └── Matlab_files
    ├── task2-NR-2.0
    │   └── Matlab_files
    └── task3-TSR
        └── Matlab_files
Preprocessing, training, and evaluation
# Data preparation
python exps/zuco/preprocess_data.py -t task1-SR
python exps/zuco/preprocess_data.py -t task2-NR
python exps/zuco/preprocess_data.py -t task3-TSR
python exps/zuco/preprocess_data_v2.py

# Build tokenizer for stage-1
python exps/zuco/build_tokenizer.py

# Training and evaluation
python exps/zuco/train_stage1.py --batch_size 128 --epochs 20
python exps/zuco/train_stage2.py --batch_size 16 --epochs 4
python exps/zuco/evaluation.py

TODO

  • Apply the proposed methodology to the NSD dataset.
  • Test other LLM decoders.
  • Add experiments for decoding text from EEG signals.
  • Cross-subject training for the NSD dataset.

Notes

  • The structure of this repository is a work in progress.

  • Some parts of the code of this project are adapted from InstructBlip; we thank the authors for their great work.

  • In the comparison on perceived speech decoding, we used the same datasets and configuration setup as in this article. Data preprocessing and preparation scripts are taken from this link. We thank the authors for their great work.

Citation

@article{hmamouche2026braindec103589,
  title = {BrainDEC: A Multimodal LLM for the Non-Invasive Decoding of Text from Brain Recordings},
  journal = {Information Fusion},
  volume = {127},
  pages = {103589},
  year = {2026},
  issn = {1566-2535},
  doi = {10.1016/j.inffus.2025.103589},
  url = {https://www.sciencedirect.com/science/article/pii/S156625352500661X},
  author = {Youssef Hmamouche and Ismail Chihab and Lahoucine Kdouri and Amal El Fallah Seghrouchni}
}
