ESPnet-SPK: add SdSV 2021 recipe #5659

Merged
sw005320 merged 21 commits into espnet:master from Alexgichamba:sdsv_2021
Feb 15, 2024

@Alexgichamba
Contributor

What?

Add a spk1 recipe on the SdSV 2021 challenge

Why?

Provide diverse recipes for ESPnet-SPK and contribute a new testing protocol for spk tasks based on the DeepMine sample dataset

See also

https://sdsvc.github.io/

@sw005320 added the Recipe and SID (Speaker identification/embedding) labels Feb 13, 2024
@sw005320 sw005320 added this to the v.202405 milestone Feb 13, 2024
@sw005320
Contributor

Thanks, @Alexgichamba!
Is it ready for review?

@Jungjee, can you review this PR?

@Alexgichamba
Contributor Author

> Thanks, @Alexgichamba! Is it ready for review?
>
> @Jungjee, can you review this PR?

Hi @sw005320
I still need to upload the model to HF and update the link in the README

@Alexgichamba Alexgichamba marked this pull request as ready for review February 13, 2024 16:09
Contributor

@Jungjee Jungjee left a comment

Added a few minor comments, lgtm in general.
Thanks for your effort Alex :-)

@@ -0,0 +1,44 @@
# generate_trial_file.py
Contributor

can you add a random seed to make it reproducible?

Contributor Author

@Alexgichamba Alexgichamba Feb 14, 2024

Could you kindly show me what part of this would be random/irreproducible? The trials are from all the combinations, not a sub sample.
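
To illustrate the reproducibility point, here is a minimal sketch, assuming trials are built from every unordered pair of utterances. The helper name `make_trials` and the ID format (speaker ID is the part before the first `-`) are assumptions for illustration, not the recipe's actual code. Because `itertools.combinations` enumerates pairs in a fixed order over a sorted input, no random seed is involved.

```python
from itertools import combinations


def make_trials(utt_ids):
    """Return (label, enroll, test) for every unordered pair of utterances.

    Hypothetical helper: assumes the speaker ID is the part of the
    utterance ID before the first "-". Label is 1 for same-speaker pairs.
    """
    trials = []
    # Sorting makes the output independent of the input order.
    for enroll, test in combinations(sorted(utt_ids), 2):
        same = enroll.split("-")[0] == test.split("-")[0]
        trials.append((int(same), enroll, test))
    return trials


utts = ["spk2-utt1", "spk1-utt2", "spk1-utt1"]
for label, enroll, test in make_trials(utts):
    print(label, enroll, test)
```

Running it twice, or with the input list shuffled, should give the same trial list, which is why a seed would only matter if the pairs were sub-sampled.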

wav_list = []

with open(input_file, "r") as f:
    paths = f.read().splitlines()
Contributor

is it different to paths = f.readlines()?

Contributor Author

Yeah, slightly. It removes the newline.
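
A small sketch of the difference, using an in-memory `io.StringIO` in place of the real input file (the file contents here are made up for illustration):

```python
import io

text = "a.wav\nb.wav\n"

# readlines() keeps the trailing newline on each line.
with io.StringIO(text) as f:
    with_newlines = f.readlines()

# read().splitlines() drops the line terminators.
with io.StringIO(text) as f:
    without_newlines = f.read().splitlines()

print(with_newlines)     # ['a.wav\n', 'b.wav\n']
print(without_newlines)  # ['a.wav', 'b.wav']
```

With `readlines()`, each path would need a `.rstrip("\n")` or `.strip()` before being used as a file name.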

@codecov

codecov bot commented Feb 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c0c801f) 76.12% compared to head (ea510cd) 76.12%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5659   +/-   ##
=======================================
  Coverage   76.12%   76.12%           
=======================================
  Files         744      744           
  Lines       69247    69247           
=======================================
  Hits        52716    52716           
  Misses      16531    16531           
| Flag | Coverage Δ |
|---|---|
| test_configuration_espnet2 | ∅ <ø> (∅) |
| test_integration_espnet1 | 62.92% <ø> (ø) |
| test_integration_espnet2 | 49.00% <ø> (ø) |
| test_python_espnet1 | 18.36% <ø> (ø) |
| test_python_espnet2 | 52.65% <ø> (ø) |
| test_utils | 22.15% <ø> (ø) |

Contributor

Why did you convert flac to wav?
ESPnet can load the flac file.

Contributor Author

I thought it would be better to have all the audio as wav. Should I skip this conversion step?

Contributor

I see.
It is redundant and uses extra disk space, but at the same time, it would be easy to check.
So, you can keep it.

Co-authored-by: Shinji Watanabe <[email protected]>
@sw005320 sw005320 merged commit 332fdc1 into espnet:master Feb 15, 2024
@sw005320
Contributor

Thanks, @Alexgichamba!
