Add new functions and fix some bugs in SE #5193

Merged: simpleoier merged 33 commits into espnet:master from Emrys365:tse on Jun 7, 2023

Emrys365 (Collaborator) commented May 31, 2023

This PR adds two main functions to ESPnet-SE (speech enhancement):

  • DNSMOS and PESQ evaluation
    • We can now evaluate the DNSMOS scores with espnet2/enh/layers/dnsmos.py.
    • We can also evaluate the PESQ score (be careful about the license).
  • utt2category support
    • The enh.sh pipeline can now automatically add the category information (utt2category) of each sample to the mini-batch when --use_utt2category true is set. The preprocessor (espnet2/train/preprocessor.py) later converts each category into a unique integer.
    • The chunk-based iterator also supports constructing mini-batches based on the category information, so that each mini-batch only contains samples of the same category.
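
The two utt2category steps above (mapping categories to unique integers and building single-category mini-batches) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code in espnet2/train/preprocessor.py or the chunk-based iterator; the function names `build_category_ids` and `batches_by_category` and the category labels such as `1ch_16k` are hypothetical.

```python
from collections import defaultdict


def build_category_ids(utt2category):
    """Map each category string to a unique integer (sorted for determinism)."""
    categories = sorted(set(utt2category.values()))
    return {cat: idx for idx, cat in enumerate(categories)}


def batches_by_category(utt2category, batch_size):
    """Group utterance IDs so that each mini-batch holds a single category."""
    groups = defaultdict(list)
    for utt_id, cat in utt2category.items():
        groups[cat].append(utt_id)
    batches = []
    for cat in sorted(groups):
        utts = groups[cat]
        # Slice each category's utterances into chunks of at most batch_size.
        for start in range(0, len(utts), batch_size):
            batches.append(utts[start:start + batch_size])
    return batches


# Hypothetical utt2category content: utterance ID -> category label.
utt2category = {
    "utt1": "1ch_16k",
    "utt2": "2ch_16k",
    "utt3": "1ch_16k",
    "utt4": "1ch_16k",
    "utt5": "2ch_16k",
}
category_ids = build_category_ids(utt2category)
batches = batches_by_category(utt2category, batch_size=2)
```

Grouping before batching means a batch never mixes, for example, single-channel and multi-channel samples, at the cost of some smaller leftover batches per category.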

In addition:

  • The bugs in the data preparation for recipes egs2/vctk_noisy/enh1 and egs2/vctk_noisyreverb/enh1 have been fixed.
  • SE losses with a weight of 0 are now computed without gradient tracking to reduce the memory footprint.
  • A new argument channel_reordering is added to EnhPreprocessor and TSEPreprocessor to support data augmentation for multi-channel signals.
  • A new argument --skip_stats_npz is added to espnet2/tasks/abs_task.py to allow skipping the preparation of *_stats.npz during the "Collecting Stats" stage. This is enabled by default for the SE task because these stats are not used later, so skipping them saves computation.
  • A new learning rate scheduler has been added in espnet2/schedulers/warmup_reducelronplateau.py, which combines the WarmupLR and ReduceLROnPlateau schedulers.
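
As a rough illustration of the last item, one plausible way to combine a warmup phase with plateau-based decay is sketched below. It is a toy, framework-free example under stated assumptions: the class name `WarmupThenPlateau`, the linear warmup shape, and the `factor`/`patience` defaults are all hypothetical, and the actual logic in espnet2/schedulers/warmup_reducelronplateau.py may differ.

```python
class WarmupThenPlateau:
    """Toy scheduler: linear warmup, then reduce the LR when a metric stalls."""

    def __init__(self, base_lr, warmup_steps, factor=0.5, patience=1):
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.factor = factor      # multiplicative decay applied on a plateau
        self.patience = patience  # non-improving epochs tolerated before decay
        self.step_num = 0
        self.lr = 0.0
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self):
        """Call once per training step; only changes the LR during warmup."""
        self.step_num += 1
        if self.step_num <= self.warmup_steps:
            self.lr = self.base_lr * self.step_num / self.warmup_steps

    def epoch_end(self, val_loss):
        """Call once per epoch; has no effect until warmup has finished."""
        if self.step_num < self.warmup_steps:
            return
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0


sched = WarmupThenPlateau(base_lr=1.0, warmup_steps=4)
warmup_lrs = []
for _ in range(4):
    sched.step()
    warmup_lrs.append(sched.lr)

# Three epochs with no improvement in validation loss after warmup.
for loss in [1.0, 1.0, 1.0]:
    sched.epoch_end(loss)
```

In this sketch the warmup is driven per training step while the plateau check is driven per epoch, which is why the two update paths are separate methods.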

@Emrys365 Emrys365 added ESPnet2 SE Speech enhancement labels May 31, 2023
@sw005320 sw005320 added this to the v.202307 milestone May 31, 2023
@sw005320 sw005320 requested a review from simpleoier May 31, 2023 12:03
simpleoier (Collaborator) left a comment

Thanks for the effort! I left some comments. My main concerns are the utt2category and collect_stats parts.

--use_noise_ref # Whether or not to use noise signal as an additional reference
for training a denoising model (default="${use_noise_ref}")
--extra_wav_list # Extra list of scp files for wav formatting (default="${extra_wav_list}")
--use_utt2category # Whether or not to load the utt2category file for training (default="${use_utt2category}")
Collaborator

ditto.

done
fi

# Add the category information at the end of the data path list
Collaborator

It is a bit redundant I think. Maybe we can do something in the sampler and preprocessor instead of here.

Collaborator Author

I think that will require more changes in more scripts. The current design only needs modifications in enh.sh, espnet2/tasks/enh.py, and espnet2/train/preprocessor.py.

format="%(asctime)s (%(module)s:%(lineno)d) %(levelname)s: %(message)s",
)

if dnsmos:
Collaborator

This is a good feature. Can you add pesq as well?

Collaborator Author

Yes, I can add the code. Please check whether this would cause a license issue.

The pesq python package seems to violate the PESQ license.

"You can download the model from https://github.com/microsoft/"
"DNS-Challenge/tree/master/DNSMOS/DNSMOS"
)
if not Path(dnsmos_args["p808_model"]).exists():
Collaborator

A minor suggestion: this DNSMOS code is very specific to the p808 model. It would be better if it were more flexible for future use.

Collaborator Author

I don't know if there are other options for the DNS model. I just followed the script provided by the official repository https://github.com/microsoft/DNS-Challenge/blob/master/DNSMOS/dnsmos_local.py#LL23C47-L23C47

codecov bot commented Jun 2, 2023

Codecov Report

Merging #5193 (5bf2ed3) into master (d26489e) will decrease coverage by 0.57%.
The diff coverage is 54.64%.

@@            Coverage Diff             @@
##           master    #5193      +/-   ##
==========================================
- Coverage   75.00%   74.43%   -0.57%     
==========================================
  Files         630      642      +12     
  Lines       56821    57611     +790     
==========================================
+ Hits        42619    42885     +266     
- Misses      14202    14726     +524     
Flag                        Coverage Δ
test_integration_espnet1    66.28% <ø> (ø)
test_integration_espnet2    47.52% <46.09%> (-0.10%) ⬇️
test_python                 65.15% <39.34%> (-0.53%) ⬇️
test_utils                  23.28% <ø> (ø)

Impacted Files                                    Coverage Δ
espnet2/enh/layers/dnsmos.py                      0.00% <0.00%> (ø)
espnet2/bin/enh_scoring.py                        68.86% <35.29%> (-15.04%) ⬇️
espnet2/train/preprocessor.py                     33.23% <65.43%> (+4.07%) ⬆️
espnet2/schedulers/warmup_reducelronplateau.py    80.18% <80.18%> (ø)
espnet2/bin/enh_tse_inference.py                  91.02% <100.00%> (ø)
espnet2/enh/espnet_model.py                       85.64% <100.00%> (-1.02%) ⬇️
espnet2/enh/espnet_model_tse.py                   79.06% <100.00%> (-1.58%) ⬇️
espnet2/iterators/chunk_iter_factory.py           91.66% <100.00%> (+9.48%) ⬆️
espnet2/tasks/abs_task.py                         76.67% <100.00%> (+0.14%) ⬆️
espnet2/tasks/enh.py                              97.45% <100.00%> (+0.01%) ⬆️
... and 1 more

... and 18 files with indirect coverage changes

sw005320 (Contributor) commented Jun 4, 2023

Can you add some tests for skipping the statistics collection and for the changes in the preprocessor?
It's OK if that is difficult.

Emrys365 (Collaborator, Author) commented Jun 4, 2023

> Can you add some tests for skipping the statistics collection and some changes in preprocessor? It's OK if it is difficult.

Sure. As for skipping the statistics collection, I think it is already covered by the integration tests. I will try to add some tests for the preprocessor.

@mergify mergify bot added the CI Travis, Circle CI, etc label Jun 5, 2023
sw005320 (Contributor) commented Jun 5, 2023

See https://github.com/espnet/espnet/actions/runs/5182160044/jobs/9338608132?pr=5193#step:18:2471

This PR looks good from my side.
So, if @Emrys365, @simpleoier, and the CI are all OK, please merge this PR.

simpleoier (Collaborator) left a comment

LGTM!
A minor suggestion: it would be better to add an acknowledgement or a reference for DNSMOS and PESQ.

Labels: Bugfix; CI (Travis, Circle CI, etc); Enhancement; ESPnet2; New Features; SE (Speech enhancement)