
Adding Hifitts recipe for espnet#5784

Merged
ftshijt merged 9 commits into espnet:master from coding-phoenix-12:hifitts_recipe
May 29, 2024


@coding-phoenix-12
Contributor

Hi-Fi TTS recipe for ESPnet.

@sw005320 added the Recipe and TTS (Text-to-speech) labels on May 22, 2024
@sw005320 added this to the v.202405 milestone on May 22, 2024
@sw005320
Contributor

@ftshijt, can you review this PR?

Collaborator

@ftshijt left a comment


Please also remember to add your corpus to egs2/README.md.

@@ -0,0 +1,187 @@
# This configuration is for ESPnet2 to train 44.1 kHz
Collaborator

Please include only the configs related to this recipe in the PR (starting with a single best config is usually sufficient).

Comment on lines +8 to +19
fs=24000
n_fft=2048
n_shift=300
win_length=1200

opts=
if [ "${fs}" -eq 48000 ]; then
    # To suppress recreation, specify wav format
    opts="--audio_format wav "
else
    opts="--audio_format flac "
fi
Collaborator

I saw that the README says the recipe is for a 22.05 kHz setup, but the default fs is set to 24000 and 48000 (for raw waveform), which seems to be a conflict.
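
A mismatch like this can be caught with a small guard near the top of a recipe script. The following is only a sketch: the function name `check_fs` and the variable `documented_fs` are hypothetical and not part of this PR, and the values are taken from the comment above (22050 as the documented rate, 24000 as the configured default).

```shell
#!/usr/bin/env bash
# Hypothetical guard, not part of the PR: report when the configured
# sampling rate differs from the rate the README documents.
check_fs() {
    local documented="$1" configured="$2"
    if [ "${configured}" -ne "${documented}" ]; then
        echo "fs=${configured} does not match documented rate ${documented}" >&2
        return 1
    fi
}

documented_fs=22050   # rate stated in the README, per the review comment
fs=24000              # default from the snippet under review

if check_fs "${documented_fs}" "${fs}"; then
    echo "fs is consistent"
else
    echo "fs conflicts with the README"
fi
```

With the values above, the guard takes the failure branch, which is the conflict the reviewer points out.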

@coding-phoenix-12
Contributor Author

I have made the requested changes. Please have a look.
Thank you.

@codecov

codecov bot commented May 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 28.15%. Comparing base (90eed8e) to head (31a7d99).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5784      +/-   ##
==========================================
+ Coverage   28.12%   28.15%   +0.02%     
==========================================
  Files         544      544              
  Lines       46233    46254      +21     
==========================================
+ Hits        13005    13023      +18     
- Misses      33228    33231       +3     
Flag Coverage Δ
test_integration_espnetez 28.15% <ø> (+0.02%) ⬆️


win_length=null

opts=
if [ "${fs}" -eq 48000 ]; then
Collaborator

Suggested change
- if [ "${fs}" -eq 48000 ]; then
+ if [ "${fs}" -eq 44100 ]; then

According to the OpenSLR page, the highest sampling rate is 44100, so that is the value to check when suppressing wav recreation.
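
As a sketch of how the format selection behaves once the suggestion is applied, the check can be wrapped in a small function. The function name `select_audio_opts` is hypothetical and only used for illustration; the `--audio_format` values and the 44100 threshold come from the snippet and the suggestion above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: choose the audio format option
# from the sampling rate, using 44100 as the rate that keeps wav.
select_audio_opts() {
    local fs="$1"
    if [ "${fs}" -eq 44100 ]; then
        # To suppress recreation, specify wav format
        echo "--audio_format wav"
    else
        echo "--audio_format flac"
    fi
}

fs=44100
# Keep the trailing space so further options can be appended, as in the recipe.
opts="$(select_audio_opts "${fs}") "
echo "fs=${fs} opts=${opts}"
```

Any other rate, for example 22050, falls through to flac.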

@ftshijt
Collaborator

ftshijt commented May 28, 2024

LGTM! Thanks for your contribution. Could you please also fix the CI errors? After that, this PR should be ready to merge.

@coding-phoenix-12
Contributor Author

The CI checks on Ubuntu seem to be failing at egs2/asvspoof/spk1/run.sh. Is there something I must do to fix this?

@sw005320
Contributor

@ftshijt
Collaborator

ftshijt commented May 29, 2024

Thanks for your contribution! The current PR looks good to me.

@ftshijt merged commit 9b31da2 into espnet:master on May 29, 2024