Thanks to visit codestin.com
Credit goes to github.com

Skip to content
forked from sp-uhh/sgmse

COnditional flow-matching for Speech Enhancement and Dereverberation

License

Mattias421/cfmse

 
 

Repository files navigation

CFMSE

This repository contains code for running experiments for: Flowing Straighter with Conditional Flow Matching for Accurate Speech Enhancement

To train an ICFM model with flow-matching loss, do --backbone ncsnpp_v2 --sde icfm --sigma 0.1 --loss_type flow_matching For the SB-SV, we build off the SB-VE, use the c parameter to set sigma e.g. --backbone ncsnpp_v2 --sde sbve --loss_type data_prediction --variance_type stationary --c 0.1 For our novel one-step sampler, set --sampler_type dp To use the xps/eval.sh script, make sure WhiSQA and DNSMOS are set up within the parent directory. Other scripts in the xps folder show examples of the settings used and how to run slurm job arrays. Audio samples can be heard here

Installation

  • Create a new virtual environment with Python 3.11 (we have not tested other Python versions, but they may work).
  • Install the package dependencies via pip install -r requirements.txt.
    • Let pip resolve the dependencies for you. If you encounter any issues, please check requirements_version.txt for the exact versions we used.
  • If using W&B logging (default):
    • Set up a wandb.ai account
    • Log in via wandb login before running our code.
  • If not using W&B logging:
    • Pass the option --nolog to train.py.
    • Your logs will be stored as local CSVLogger logs in lightning_logs/.

Training

Training is done by executing train.py. A minimal running example with default settings can be run with

python train.py --base_dir <your_base_dir>

where your_base_dir should be a path to a folder containing subdirectories train/ and valid/ (optionally test/ as well). Each subdirectory must itself have two subdirectories clean/ and noisy/, with the same filenames present in both. We currently only support training with .wav files.

To see all available training options, run python train.py --help. Note that the available options for the SDE and the backbone network change depending on which SDE and backbone you use. These can be set through the --sde and --backbone options.

Evaluation

To evaluate on a test set, run

python enhancement.py --test_dir <your_test_dir> --enhanced_dir <your_enhanced_dir> --ckpt <path_to_model_checkpoint>

to generate the enhanced .wav files, and subsequently run

python calc_metrics.py --test_dir <your_test_dir> --enhanced_dir <your_enhanced_dir>

to calculate and output the instrumental metrics.

Both scripts should receive the same --test_dir and --enhanced_dir parameters. The --cpkt parameter of enhancement.py should be the path to a trained model checkpoint, as stored by the logger in logs/.

Citations / References

This work (code, readme, experimental setup) is based off SGMSE by sp-uhh, please check them out!

To cite this work, please use the following bibtex

@InProceedings{cross2025cfmse,
  title = 	 {Flowing Straighter with Conditional Flow Matching for Accurate Speech Enhancement},
  author =       {Cross, Mattias and Ragni, Anton},
  booktitle = 	 {Proceedings of the 2nd ECAI Workshop on "Machine Learning Meets Differential Equations: From Theory to Applications"},
  pages = 	 {121--132},
  year = 	 {2025},
  editor = 	 {Coelho, Cecı́lia and Zimmering, Bernd and Costa, M. Fernanda P. and Ferrás, Luı́s L. and Niggemann, Oliver},
  volume = 	 {277},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {26 Oct},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v277/main/assets/cross25a/cross25a.pdf},
  url = 	 {https://proceedings.mlr.press/v277/cross25a.html},
}

About

COnditional flow-matching for Speech Enhancement and Dereverberation

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 89.5%
  • Shell 5.2%
  • Cuda 4.7%
  • C++ 0.6%