PyTorch implementation for filling in MIDI velocities from given MIDI notes. The model is a U-Net image colorizer trained on expert performances from the Piano-e-Competition (MAESTRO dataset). It works on any instrumental MIDI, but is most expressive on piano (training on other instruments is planned for future work).
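The snippet below is a minimal sketch of this colorization framing (note on/off as the grayscale input, velocity as the missing "color"), assuming a pretty_midi-based piano-roll pipeline; the file name and the 10 ms frame rate are illustrative assumptions, not taken from the repo's code.

```python
# Minimal sketch of the piano-roll "image" framing, assuming pretty_midi;
# the frame rate and file name below are illustrative assumptions.
import numpy as np
import pretty_midi

midi = pretty_midi.PrettyMIDI("example.mid")          # any piano MIDI file
notes = midi.instruments[0].notes

fs = 100                                              # 10 ms frames (assumed)
n_frames = int(np.ceil(midi.get_end_time() * fs))

onoff = np.zeros((128, n_frames), dtype=np.float32)     # "grayscale" input: note on/off
velocity = np.zeros((128, n_frames), dtype=np.float32)  # "color" target: velocity

for note in notes:
    start = int(note.start * fs)
    end = max(int(note.end * fs), start + 1)
    onoff[note.pitch, start:end] = 1.0
    velocity[note.pitch, start:end] = note.velocity / 127.0  # scale to [0, 1]
```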
This repo provides supplementary materials for our paper "Filling MIDI Velocity using U-Net Image Colorizer" (CMMR 2025):
- WandB workspace (publicly available, but requires logging in to WandB first)
- WandB report
  - This WandB report includes the quantitative results; refer to Tables 2 & 3 in the paper.
  - This WandB report also includes the hyperparameter search not detailed in the paper (omitted there to save space).
- train.py: model training entry point
- evaluation.ipynb: reproduces the quantitative results in Tables 2 & 3 of our paper
- interface.ipynb: colorizes & visualizes the MIDI; reproduces Figures 1–4 in our paper
- results/: directory for models, outputs and audio demos
  - checkpoints.zip: download and extract here to run the interface with our trained model
  - subjective_test_audio.rar: listening test samples generated from our colorized MIDI using PianoTeq 8 (see Section 5.2 of the paper)
- compare/: the Flat model and Kim2023's Seq2Seq model
All training settings are defined in conf/config.yaml.
- Some data filtering operations were implemented but not used in our experiments.
- To reproduce our results, please refer to the training logs and parameter tracking in our WandB workspace.
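The dotted overrides in the training commands below look Hydra/OmegaConf-style; assuming that, the sketch here shows one way to inspect or tweak the resolved options outside the CLI. The key names are taken from the commands in this README; everything else is an assumption about the repo's internals.

```python
# Hedged sketch: inspect/override conf/config.yaml with OmegaConf,
# assuming the repo's dotted CLI overrides map onto an OmegaConf config.
from omegaconf import OmegaConf

cfg = OmegaConf.load("conf/config.yaml")
print(OmegaConf.to_yaml(cfg))  # dump all training settings

# Apply the same kind of dotted overrides the CLI uses
overrides = OmegaConf.from_dotlist(["ae.model=UNet", "loss.cosim=0.2"])
cfg = OmegaConf.merge(cfg, overrides)
```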
Please download the following datasets and place them under the /dataset folder:
Tested on:
- Ubuntu 20.04 (CUDA 12.0)
- Ubuntu 22.04 (CUDA 12.2)
Create the environment from the provided file:

```bash
conda env create -f environment.yaml
```

Alternatively, create it manually:

```bash
conda create --name velocity_pred python=3.11
conda install pytorch=2.2.2 torchvision=0.17.2 torchaudio=2.2.2 pytorch-cuda=12.1 -c pytorch -c nvidia
conda install lightning -c conda-forge
```

We use Weights & Biases for experiment tracking:

```bash
wandb login
```

If preferred, you can switch to TensorBoard by modifying train.py.
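A rough sketch of what that TensorBoard swap might look like, assuming train.py builds a standard PyTorch Lightning Trainer with a WandbLogger; the project name and log directory below are placeholders, not names from the repo.

```python
# Hedged sketch of switching the experiment logger in train.py,
# assuming a standard Lightning Trainer; names below are placeholders.
from lightning.pytorch import Trainer
from lightning.pytorch.loggers import TensorBoardLogger, WandbLogger

# logger = WandbLogger(project="velocity_pred")        # default: Weights & Biases
logger = TensorBoardLogger(save_dir="results/tb_logs")  # alternative: TensorBoard

trainer = Trainer(logger=logger)
```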
Edit training options in conf/config.yaml. Then run:
```bash
# Re-implemented ConvAE baseline
python train.py exp.test_dataset="MAESTRO" matrix.seg_time=10 ae.model="ConvAE" loss.type="BCELoss" loss.mask='element_wise' loss.weight='u_shape' loss.cosim=0.2 exp.save_k_ckpt=10

# Proposed U-Net with 2x2 attention window
python train.py exp.test_dataset="MAESTRO" matrix.seg_time=10 ae.model="UNet" ae.ablation="attn" ae.attn_window=2 loss.type="BCELoss" loss.mask='element_wise' loss.weight='u_shape' loss.cosim=0.2 exp.save_k_ckpt=10
```

Use evaluation.ipynb to reproduce our results. You can skip training by downloading our pretrained model checkpoints.zip and extracting it into the results/checkpoints folder.
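If you want to sanity-check the extracted checkpoints before opening the notebook, the sketch below assumes they are standard Lightning .ckpt files under results/checkpoints; the glob pattern is an assumption, not the repo's layout.

```python
# Hedged sketch: peek inside a pretrained checkpoint from checkpoints.zip,
# assuming standard Lightning .ckpt files under results/checkpoints.
from pathlib import Path
import torch

ckpt_path = next(Path("results/checkpoints").glob("**/*.ckpt"))
state = torch.load(ckpt_path, map_location="cpu")
print(ckpt_path.name, list(state.keys()))  # typically 'state_dict', 'hyper_parameters', ...
```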
For the other models' results in Tables 2 & 3:

- Flat model: compare/Flat_model/flat_evaluation.ipynb
- Kim2023 Seq2Seq model: compare/Kim2023_model/seq2seq_evalaution.ipynb
Use interface.ipynb to fill in a MIDI file's velocities: given a MIDI without velocity, you can visualize and obtain the corresponding velocity-filled MIDI. You can skip training and directly use our pretrained model from checkpoints.zip, as in Step 5. Below are MIDI examples from [Human] and from the [ConvAE, UNet] results; find more examples in interface.ipynb and audio demos in subjective_test_audio.rar.
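As a rough illustration of the final write-back step the notebook performs, the helper below assumes the model produces a 128 x n_frames velocity map in [0, 1] at the same frame rate as the input roll; the function name and frame rate are placeholders of ours, not the notebook's API.

```python
# Hedged sketch: write a predicted velocity map back into a MIDI file,
# assuming a 128 x n_frames prediction in [0, 1]; names/frame rate are placeholders.
import numpy as np
import pretty_midi

def fill_velocities(in_path: str, out_path: str, pred: np.ndarray, fs: int = 100) -> None:
    midi = pretty_midi.PrettyMIDI(in_path)
    for inst in midi.instruments:
        for note in inst.notes:
            start = min(int(note.start * fs), pred.shape[1] - 1)
            end = min(max(int(note.end * fs), start + 1), pred.shape[1])
            v = float(pred[note.pitch, start:end].mean())      # average prediction over the note
            note.velocity = int(np.clip(round(v * 127), 1, 127))
    midi.write(out_path)
```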
Copyright 2025 Dolby. The code and checkpoints are released for academic and non-commercial use only, under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).