Add new functions and fix some bugs in SE #5193
…cement related tasks
for more information, see https://pre-commit.ci
simpleoier
left a comment
Thanks for the effort! I left some comments. I am mainly concerned about the utt2category and collect_stats parts.
egs2/TEMPLATE/enh1/enh.sh
| --use_noise_ref # Whether or not to use noise signal as an additional reference | ||
| for training a denoising model (default="${use_noise_ref}") | ||
| --extra_wav_list # Extra list of scp files for wav formatting (default="${extra_wav_list}") | ||
| --use_utt2category # Whether or not to load the utt2category file for training (default="${use_utt2category}") |
| done | ||
| fi | ||
|
|
||
| # Add the category information at the end of the data path list |
I think this is a bit redundant. Maybe we can handle it in the sampler and preprocessor instead of here.
I think that will require more changes in more scripts. The current design only needs modifications in enh.sh, espnet2/tasks/enh.py, and espnet2/train/preprocessor.py.
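For readers following this thread, here is a minimal sketch of the idea under discussion: each utterance's category label is turned into a unique integer before it reaches the mini-batch. It is not the code in `espnet2/train/preprocessor.py`. The `utt_id category` line format, the helper names `read_utt2category` and `build_category_index`, and the category labels are all assumptions made for illustration.

```python
def read_utt2category(lines):
    """Parse 'utt_id category' lines into a dict (assumed file format)."""
    utt2cat = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        utt_id, category = line.split(maxsplit=1)
        utt2cat[utt_id] = category
    return utt2cat


def build_category_index(utt2cat):
    """Give each distinct category a unique integer.

    Sorting makes the mapping deterministic across runs.
    """
    return {cat: idx for idx, cat in enumerate(sorted(set(utt2cat.values())))}


# Hypothetical category labels, for illustration only.
lines = ["utt1 1ch_16k", "utt2 2ch_16k", "utt3 1ch_16k"]
utt2cat = read_utt2category(lines)
cat2int = build_category_index(utt2cat)
utt2int = {utt: cat2int[cat] for utt, cat in utt2cat.items()}
print(utt2int)
```

Building the index from the sorted set of labels keeps the integer assignment stable regardless of the order in which utterances are read.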
espnet2/bin/enh_scoring.py
| format="%(asctime)s (%(module)s:%(lineno)d) %(levelname)s: %(message)s", | ||
| ) | ||
|
|
||
| if dnsmos: |
This is a good feature. Can you add pesq as well?
Yes, I can add the code. Please check whether this would cause a license issue.
The `pesq` Python package seems to violate the PESQ license.
| "You can download the model from https://github.com/microsoft/" | ||
| "DNS-Challenge/tree/master/DNSMOS/DNSMOS" | ||
| ) | ||
| if not Path(dnsmos_args["p808_model"]).exists(): |
A minor suggestion: this DNSMOS code is very specific to the P.808 model. It would be better to make it more flexible for future use.
I don't know if there are other options for the DNS model. I just followed the script provided by the official repository https://github.com/microsoft/DNS-Challenge/blob/master/DNSMOS/dnsmos_local.py#LL23C47-L23C47
Codecov Report
@@ Coverage Diff @@
## master #5193 +/- ##
==========================================
- Coverage 75.00% 74.43% -0.57%
==========================================
Files 630 642 +12
Lines 56821 57611 +790
==========================================
+ Hits 42619 42885 +266
- Misses 14202 14726 +524
... and 18 files with indirect coverage changes
Can you add some tests for skipping the statistics collection and for the changes in the preprocessor?
Sure. For the |
See https://github.com/espnet/espnet/actions/runs/5182160044/jobs/9338608132?pr=5193#step:18:2471
This PR looks good from my side.
simpleoier
left a comment
LGTM!
A minor suggestion: it would be better to add an acknowledgement of, or a reference to, DNSMOS and PESQ.
This PR adds two main functions to ESPnet-SE (speech enhancement):
- … `espnet2/enh/layers/dnsmos.py`.
- The `enh.sh` pipeline can automatically add the category information (`utt2category`) of each sample to the mini-batch when `--use_utt2category true` is set. The preprocessor (`espnet2/train/preprocessor.py`) later converts each category into a unique integer.

In addition:
- `egs2/vctk_noisy/enh1` and `egs2/vctk_noisyreverb/enh1` have been fixed.
- `channel_reordering` is added to EnhPreprocessor and TSEPreprocessor to support data augmentation for multi-channel signals.
- `--skip_stats_npz` is added to `espnet2/tasks/abs_task.py` to allow skipping the preparation of `*_stats.npz` during the "Collecting Stats" stage. This is enabled by default for the SE task because these stats are not used later, so skipping them saves computation.
- … `espnet2/schedulers/warmup_reducelronplateau.py`, which combines the WarmupLR and ReduceLROnPlateau schedulers.
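
To illustrate what combining a warmup schedule with reduce-on-plateau can look like, here is a small pure-Python sketch. It is not the implementation in `espnet2/schedulers/warmup_reducelronplateau.py`. The class name `WarmupThenPlateau`, its parameters, the linear warmup shape, and the rule of ignoring plateaus during warmup are all assumptions made for illustration.

```python
class WarmupThenPlateau:
    """Toy scheduler: linear warmup per step, then reduce-on-plateau per epoch.

    Hypothetical sketch; names and behaviour are assumptions, not ESPnet's API.
    """

    def __init__(self, base_lr, warmup_steps, factor=0.5, patience=1):
        self.base_lr = base_lr
        self.warmup_steps = max(1, warmup_steps)  # avoid division by zero
        self.factor = factor      # multiplicative reduction on a plateau
        self.patience = patience  # bad epochs tolerated before reducing
        self.step_num = 0
        self.scale = 1.0          # accumulated plateau reduction
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self):
        """Call once per optimizer step; returns the current learning rate."""
        self.step_num += 1
        warm = min(1.0, self.step_num / self.warmup_steps)
        return self.base_lr * warm * self.scale

    def epoch_end(self, val_loss):
        """Call once per epoch with the validation loss."""
        if self.step_num < self.warmup_steps:
            return  # assumption: ignore plateaus while still warming up
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.scale *= self.factor
                self.bad_epochs = 0


sched = WarmupThenPlateau(base_lr=1e-3, warmup_steps=4)
lrs = [sched.step() for _ in range(4)]  # learning rate ramps up linearly
for loss in (1.0, 1.0, 1.0):            # validation loss stops improving
    sched.epoch_end(loss)
lr_after = sched.step()                 # rate after one plateau reduction
print(lrs, lr_after)
```

Tracking the plateau reduction as a separate `scale` factor keeps the two mechanisms independent: warmup controls the ramp, and the plateau logic only ever multiplies the result.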