|
Please fix the CI issue |
|
Could you add some unit tests under
|
|
It would be great if you also share the trained model on HuggingFace. You could check this link for more information on how to upload the model: https://github.com/espnet/espnet/blob/master/CONTRIBUTING.md#132-espnet2-recipes |
|
Any update, @zqwang7? |
In the most recent commits, "fix isort" changes "from espnet2.enh.loss.criterions.tf_domain import FrequencyDomainL1, FrequencyDomainMSE" to "from espnet2.enh.loss.criterions.tf_domain import (FrequencyDomainL1,". But then there is a code format error and I need to run black (run black), and black changes "from espnet2.enh.loss.criterions.tf_domain import (FrequencyDomainL1," back to "from espnet2.enh.loss.criterions.tf_domain import FrequencyDomainL1, FrequencyDomainMSE". I have no idea how to deal with this. |
Would changing "from espnet2.enh.loss.criterions.tf_domain import FrequencyDomainL1, FrequencyDomainMSE" to "from espnet2.enh.loss.criterions.tf_domain import FrequencyDomainL1" solve the problem? |
|
It seems to be a version issue. Could you update both packages to their latest versions and retry? |
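The thread does not establish why isort and black keep undoing each other here. One common way the two tools can disagree is their line-length limits: isort's documented default is 79 characters and black's is 88, and a project configuration may override either. The sketch below only measures the import line quoted above against those two defaults; the helper name `fits` is made up for illustration, and whether this is the actual cause in this PR is an assumption.

```python
# Documented tool defaults; a project's setup.cfg or pyproject.toml may override them.
ISORT_DEFAULT_LINE_LENGTH = 79
BLACK_DEFAULT_LINE_LENGTH = 88

# The single-line form of the import quoted in the comment above.
IMPORT_LINE = (
    "from espnet2.enh.loss.criterions.tf_domain import "
    "FrequencyDomainL1, FrequencyDomainMSE"
)


def fits(line: str, limit: int) -> bool:
    """Hypothetical helper: True if the line is within the given length limit."""
    return len(line) <= limit


# If the line is too long for one tool but short enough for the other,
# one tool will wrap it and the other will join it back.
isort_keeps_one_line = fits(IMPORT_LINE, ISORT_DEFAULT_LINE_LENGTH)
black_keeps_one_line = fits(IMPORT_LINE, BLACK_DEFAULT_LINE_LENGTH)

print(len(IMPORT_LINE), isort_keeps_one_line, black_keeps_one_line)
```

If the limits do turn out to be the cause, running isort with its black-compatible profile (`isort --profile black`) is the usual way to make the two tools agree.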
Co-authored-by: Wangyou Zhang <[email protected]>
…gridnet git stash and then git pull origin_ZQ tfgridnet
|
See https://github.com/espnet/espnet/actions/runs/4023093431/jobs/6913553984#step:8:9974 |
Codecov Report
@@            Coverage Diff             @@
##           master    #4864      +/-   ##
==========================================
+ Coverage   76.56%   76.63%   +0.07%
==========================================
  Files         603      604       +1
  Lines       53738    53934     +196
==========================================
+ Hits        41142    41334     +192
- Misses      12596    12600       +4
|
Code for TF-GridNet, proposed in:
[1] Z.-Q. Wang, S. Cornell, S. Choi, Y. Lee, B.-Y. Kim, and S. Watanabe,
"TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation",
in arXiv preprint arXiv:2211.12433, 2022.
[2] Z.-Q. Wang, S. Cornell, S. Choi, Y. Lee, B.-Y. Kim, and S. Watanabe,
"TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation",
in arXiv preprint arXiv:2209.03952, 2022.
Using D=48, I=4, J=1, H=192 (with a 16 ms window and an 8 ms hop), the model obtains an SI-SDR (SI-SNR) of 22.8 dB on WSJ0-2mix. This is close to the 23.2 dB (obtained with a 32 ms window and an 8 ms hop) reported in the journal submission. We think the result is reasonably close, considering that, when using a 4 s chunk_length and the default SI-SNR loss for training, the current chunking mechanism in ESPnet-SE discards ~15% of the training examples as well as some trailing segments (each of which can be up to 2 seconds long).
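The effect of chunking described above can be illustrated with a minimal sketch. It assumes that utterances shorter than the chunk length are dropped entirely and that chunks are taken with a shift of half the chunk length (2 s for a 4 s chunk), which would be consistent with trailing segments of up to 2 seconds being discarded. The function name `chunk_stats`, the 8 kHz sampling rate, and the example durations are illustrative assumptions, not ESPnet-SE code.

```python
from typing import Optional, Tuple


def chunk_stats(
    num_samples: int, chunk_len: int, chunk_shift: int
) -> Optional[Tuple[int, int]]:
    """Return (num_chunks, leftover_samples), or None if the utterance is dropped.

    Hypothetical helper: models fixed-length chunking with a fixed shift.
    """
    if num_samples < chunk_len:
        # Too short to yield even one chunk: the whole utterance is discarded.
        return None
    num_chunks = (num_samples - chunk_len) // chunk_shift + 1
    # Samples not covered by any chunk; always smaller than chunk_shift.
    leftover = (num_samples - chunk_len) % chunk_shift
    return num_chunks, leftover


fs = 8000                      # assumed sampling rate in Hz
chunk_len = 4 * fs             # 4 s chunk_length, as in the description above
chunk_shift = chunk_len // 2   # assumed shift of half a chunk (2 s)

for seconds in (3.0, 4.0, 9.5):
    print(seconds, chunk_stats(int(seconds * fs), chunk_len, chunk_shift))
```

Under these assumptions, a 3 s utterance contributes nothing to training, while a 9.5 s utterance yields three chunks and leaves 1.5 s unused.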