A code template for training DNN-based speech enhancement models.

A training code template is highly valuable for deep learning engineers because it can significantly improve their efficiency. Programmers have different coding styles, and some are better than others; my philosophy is to prioritize simplicity. Here I share a practical way to organize training code for speech enhancement (SE). The focus is on keeping it concise and intuitive rather than comprehensive.

🔥 News

[2025-3-31] Added a new branch named plus for better implementation. Please use this one directly.
[2024-5-28] Added a new branch named pro for better implementation.

File Specification

configs: Configuration files for training and inference.
DNSMOS: Pre-trained DNSMOS checkpoints from Microsoft.
evaluation: Metric calculation scripts adapted from URGENT 2024 official scripts.
models: Model definitions.
prepare_datasets: Scripts for generating DNS3 training data.
dataloader.py: Dataset class for the dataloader.
distributed_utils.py: Distributed Data Parallel (DDP) training utils.
evaluate.py: Evaluation script based on scp files obtained by inference.
infer.py: Inference script.
loss_factory.py: Various useful loss functions in SE.
scheduler.py: Warmup scheduler definition.
train.py: Training script, supporting both multi-GPU and single-GPU training.
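
To illustrate what a warmup scheduler typically computes, here is a minimal sketch in plain Python. It is not the code in scheduler.py, whose exact rule is not described here; the function name `warmup_lr`, its default values, and the inverse-square-root decay after warmup are all assumptions chosen for illustration.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Return a learning rate for a 1-indexed training step.

    Assumed rule (hypothetical, not taken from scheduler.py):
    linear ramp from 0 to base_lr over warmup_steps,
    then inverse-square-root decay.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if step <= warmup_steps:
        # Linear warmup: the rate grows in proportion to the step count.
        return base_lr * step / warmup_steps
    # After warmup: decay proportionally to 1 / sqrt(step).
    return base_lr * (warmup_steps / step) ** 0.5
```

With these defaults the rate peaks at `base_lr` on step 1000 and has halved by step 4000. In a PyTorch project, a rule like this is usually wrapped in a scheduler object so the optimizer's learning rate is updated after every step.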

Usage

When starting a new SE project, you should follow these steps:

Modify dataloader.py to match your dataset;
Define your own model in models;
Modify configs/cfg_train.yaml to match your training setup;
Select a loss function in loss_factory.py, or create a new one if needed;

Run train.py:

python train.py
python train.py -D 1
python train.py -C configs/cfg_train.yaml -D 1
python train.py -C configs/cfg_train.yaml -D 0,1,2,3

After training finishes, specify your checkpoint and paths in configs/cfg_infer.yaml;
Run evaluate.py.
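
The `-C` and `-D` flags in the train.py commands above could be handled along the following lines. This is only a sketch under stated assumptions: the long option names (`--config`, `--device`), the default values, and the helper `device_ids` are hypothetical and may differ from what train.py actually does.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags shown in the commands above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Training entry point (sketch)")
    # Assumed default: the config path used in the example commands.
    parser.add_argument("-C", "--config", default="configs/cfg_train.yaml",
                        help="path to the training config file")
    # Assumed default: a single GPU with index 0.
    parser.add_argument("-D", "--device", default="0",
                        help="comma-separated GPU indices, e.g. 0,1,2,3")
    return parser.parse_args(argv)


def device_ids(device_arg):
    """Turn a string such as '0,1,2,3' into a list of integer GPU indices."""
    return [int(d) for d in device_arg.split(",") if d.strip()]
```

For example, `device_ids(parse_args(["-D", "0,1,2,3"]).device)` yields four indices, which a training script could use to choose between a single-GPU run and a DDP run with one process per device.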

Note

The code was originally written for Linux. If you adapt it to Windows, you may encounter the following issues:

Incompatibility of paths: File paths written for Linux may not be valid on Windows.
Challenges in installing the pesq package: Installing pesq on Windows may not be straightforward and may require additional steps or configuration.
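
One common way to reduce path problems is to build paths with `pathlib` instead of hard-coding separators. The sketch below shows the idea; the helper names `config_path` and `to_posix` and the example paths are hypothetical and are not part of this repo.

```python
from pathlib import Path, PureWindowsPath


def config_path(root, name="cfg_train.yaml"):
    """Join path components with the separator of the current platform."""
    return Path(root) / "configs" / name


def to_posix(path_str):
    """Rewrite a Windows-style path string with forward slashes."""
    return PureWindowsPath(path_str).as_posix()
```

Paths built with `Path` and the `/` operator work on both Linux and Windows, while `to_posix` can normalize path strings (for example, entries in a file list) that were written with backslashes.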

If you find this repo useful, please consider starring it.

Acknowledgement

This code template borrows heavily from the excellent Sheffield_Clarity_CEC1_Entry repository.

Xiaobin-Rong/SEtrain
