Thanks to visit codestin.com
Credit goes to github.com

Skip to content

SE function updates: new models and support for handling various sampling frequencies#5800

Merged
mergify[bot] merged 14 commits intoespnet:masterfrom
Emrys365:urgent_recipe
Jun 13, 2024
Merged

SE function updates: new models and support for handling various sampling frequencies#5800
mergify[bot] merged 14 commits intoespnet:masterfrom
Emrys365:urgent_recipe

Conversation

@Emrys365
Copy link
Collaborator

@Emrys365 Emrys365 commented Jun 5, 2024

What?

This PR updates the speech enhancement functions from the following aspects:

  • Added three new models: BSRNN, TF-GridNetV3 (modified based on V2 to support various sampling rates, not officially named), and mapping-based TCN
  • Improved espnet2/bin/enh_inference.py and espnet2/bin/enh_scoring.py to support various sampling rates
  • Improved espnet2/iterators/chunk_iter_factory.py to support
    • keeping short samples (for avoiding wasting many short training samples)
    • and truncating samples to a max length (for avoiding OOM) when handling different sampling rates
  • Added a new augmentation type bandwidth_limitation in espnet2/layers/augmentation.py
  • Added a new argument always_forward_in_48k in espnet2/enh/espnet_model.py to support handling different sampling rates in a single SE model
  • Improved l1_timedomain+magspec_loss in espnet2/enh/loss/criterions/time_domain.py to be more numerically stable.

Why?

These updates provide important functions for the forthcoming URGENT Challenge.

See also

…mprove espnet2/bin/enh_inference.py and espnet2/bin/enh_scoring.py to support various sampling rates; improve ChunkIterator to support keeping short samples and truncating samples to a max length; Add bandwidth_limitation in espnet2/layers/augmentation.py; Update espnet2/enh/espnet_model.py to support handling different sampling rates in one model
@Emrys365 Emrys365 added ESPnet2 SE Speech enhancement labels Jun 5, 2024
@mergify mergify bot added the CI Travis, Circle CI, etc label Jun 5, 2024
@sw005320
Copy link
Contributor

sw005320 commented Jun 7, 2024

Cool!

@LiChenda and @kohei0209, can you review this PR?

@LiChenda
Copy link
Contributor

LiChenda commented Jun 8, 2024

Cool!

@LiChenda and @kohei0209, can you review this PR?

Sure!

@codecov
Copy link

codecov bot commented Jun 8, 2024

Codecov Report

Attention: Patch coverage is 81.69935% with 84 lines in your changes missing coverage. Please review.

Project coverage is 52.59%. Comparing base (293c1cc) to head (e707989).

Files Patch % Lines
espnet2/enh/espnet_model.py 38.46% 16 Missing ⚠️
espnet2/enh/separator/tfgridnetv3_separator.py 91.53% 16 Missing ⚠️
espnet2/layers/augmentation.py 25.00% 15 Missing ⚠️
espnet2/bin/enh_inference.py 55.55% 12 Missing ⚠️
espnet2/bin/enh_scoring.py 9.09% 10 Missing ⚠️
espnet2/iterators/chunk_iter_factory.py 70.37% 8 Missing ⚠️
espnet2/enh/layers/bsrnn.py 98.21% 2 Missing ⚠️
espnet2/train/preprocessor.py 33.33% 2 Missing ⚠️
espnet2/enh/loss/criterions/time_domain.py 66.66% 1 Missing ⚠️
espnet2/enh/separator/bsrnn_separator.py 96.29% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5800      +/-   ##
==========================================
- Coverage   54.59%   52.59%   -2.01%     
==========================================
  Files         771      775       +4     
  Lines       70732    71155     +423     
==========================================
- Hits        38616    37422    -1194     
- Misses      32116    33733    +1617     
Flag Coverage Δ
test_python_espnet2 52.59% <81.69%> (-0.39%) ⬇️
test_utils ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sw005320 sw005320 added this to the v.202405 milestone Jun 10, 2024
Copy link
Contributor

@LiChenda LiChenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Emrys365 , I updated some of my comments.

ref_ = ref[i]
inf_ = inf[int(perm[i])]
elif sample_rate > 16000:
mode = "wb"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd better add some log or warning here.

Copy link
Collaborator Author

@Emrys365 Emrys365 Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been added in the lines below.

Args:
fs (int): new sampling rate
"""
assert fs % self.default_fs == 0 or self.default_fs % fs == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this assertion was removed?

Copy link
Collaborator Author

@Emrys365 Emrys365 Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to allow the process of speech with a sampling rate that is fractional of the encoder's default sampling rate. It doesn't have to be exactly 1/n or a multiple, so I removed this line.

Args:
fs (int): new sampling rate
""" # noqa: H405
assert fs % self.default_fs == 0 or self.default_fs % fs == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this assertion was removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to allow the process of speech with a sampling rate that is fractional of the encoder's default sampling rate. It doesn't have to be exactly 1/n or a multiple, so I removed this line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the consideration of adding a new tcn_separator2.py instead of updating functions in tcn_separator.py ? Is it necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have merged them!

@sw005320 sw005320 added auto-merge Enable auto-merge labels Jun 13, 2024
@sw005320
Copy link
Contributor

OK for me.
Once CI is passed, I'll merge this PR.

@sw005320
Copy link
Contributor

@Emrys365, we have an issue in the CI https://github.com/espnet/espnet/actions/runs/9500003127/job/26182089849?pr=5800
Please fix it.

@Emrys365
Copy link
Collaborator Author

OK! It should be fixed now.

@kohei0209
Copy link
Contributor

@Emrys365 Sorry for the late response. Codes look good to me!

@mergify mergify bot merged commit 63c4c09 into espnet:master Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge Enable auto-merge CI Travis, Circle CI, etc ESPnet2 SE Speech enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants