ml_superb asr2 recipe#5866

Merged
ftshijt merged 15 commits into espnet:master from Stanwang1210:bootcamp
Sep 6, 2024

Conversation

@Stanwang1210
Contributor

What?

Add an asr2 recipe to egs2/ml_superb

Why?

I fixed the run.sh and local/data.sh of egs2/interspeech2024_dsu_challenge to fit the format of ml_superb.

@codecov

codecov bot commented Aug 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 51.79%. Comparing base (91420f5) to head (d35b241).
Report is 306 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5866      +/-   ##
==========================================
+ Coverage   47.84%   51.79%   +3.94%     
==========================================
  Files         501      818     +317     
  Lines       44796    75376   +30580     
==========================================
+ Hits        21432    39039   +17607     
- Misses      23364    36337   +12973     
Flag                     | Coverage Δ
test_integration_espnet2 | ?
test_python_espnet2      | 50.27% <ø> (?)
test_utils               | 20.66% <ø> (?)

Flags with carried forward coverage won't be shown.


@Stanwang1210
Contributor Author

Stanwang1210 commented Aug 11, 2024

The CI error seems to be unrelated to my PR:

=========================== short test summary info ============================
FAILED test/espnet2/text/test_phoneme_tokenizer.py::test_text2tokens[g2p_en] - LookupError: 

Can someone help me fix this?

@sw005320
Contributor

We're working on fixing it.

@ftshijt ftshijt added Recipe ASR Automatic speech recognition labels Aug 15, 2024
@ftshijt ftshijt added this to the v.202405 milestone Aug 15, 2024
Collaborator

@ftshijt ftshijt left a comment


Many thanks! The whole recipe looks good; the README in particular is very instructive. I left a few minor comments with fixes and software-level suggestions.

@@ -0,0 +1,93 @@
# Trained with A100 (40 GB) x 1 GPUs for Kmeans1K+nbpe5K. It takes 32 minutes per epoch.

Hi, since the two configs are very similar, we may simply keep the one with the best performance.

@@ -0,0 +1,145 @@
#!/usr/bin/env bash

If the data preparation is identical to asr1's, you may use a symlink.
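To illustrate the symlink approach, here is a runnable sketch in a temporary directory; in the actual recipe the paths would be egs2/ml_superb/asr1 and egs2/ml_superb/asr2 (layout assumed, not taken from the PR):

```shell
# Sketch only: share asr1's data-preparation scripts with asr2 via a
# symlink instead of copying them. Uses a temp dir; real paths differ.
tmp=$(mktemp -d)
mkdir -p "$tmp/asr1/local" "$tmp/asr2"

# asr2/local becomes a relative symlink to asr1/local
ln -s ../asr1/local "$tmp/asr2/local"
```

A relative link keeps the recipe portable when the egs2 tree is checked out at a different path.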

@@ -0,0 +1,398 @@
import argparse

Ditto for other files.

Comment on lines +8 to +11
# Process Pipeline
source /home/stan/miniconda3/envs/espnet_discrete/etc/profile.d/conda.sh
conda activate /home/stan/miniconda3/envs/espnet_discrete
export NCCL_P2P_DISABLE=1

Suggested change
# Process Pipeline
source /home/stan/miniconda3/envs/espnet_discrete/etc/profile.d/conda.sh
conda activate /home/stan/miniconda3/envs/espnet_discrete
export NCCL_P2P_DISABLE=1

We can safely remove this machine-specific local setup.
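A hedged sketch of the portable alternative: rather than hard-coding a personal conda path, recipes typically source a shared setup script (such as ESPnet's path.sh) at the top of run.sh. The stub file below stands in for that script and is illustrative only:

```shell
# Illustrative stub: in a real recipe a path.sh already exists in the
# recipe directory and activates the environment for any user.
tmp=$(mktemp -d)
printf 'export ESPNET_ENV=shared\n' > "$tmp/path.sh"

# run.sh then sources it instead of a user-specific conda setup:
. "$tmp/path.sh"
```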

Comment on lines +13 to +14
stage=15
stop_stage=15

Please initialize them to stage=1 and stop_stage=100 for the default setup.
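For reference, a sketch of the stage-gating pattern these defaults feed into (assuming the common ESPnet run.sh convention):

```shell
#!/usr/bin/env bash
stage=1        # first stage to execute
stop_stage=100 # large default so every stage runs

# Each stage runs only if it falls inside [stage, stop_stage]:
if [ "${stage}" -le 1 ] && [ "${stop_stage}" -ge 1 ]; then
    echo "Stage 1: data preparation"
fi
```

With stage=15 and stop_stage=15 only stage 15 would run, which is handy while debugging but surprising as a default.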

Comment on lines +27 to +28
inference_config=conf/decode_ctc0.3.yaml
asr_config=conf/tuning/train_discrete_asr_e_branchformer1_1gpu_lr5e-4_warmup5k.yaml

Following our convention, you may name your configs conf/decode.yaml and conf/train.yaml for the first setup.
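One way to satisfy the naming convention without losing the descriptive tuning name is a pair of symlinks. A runnable sketch in a temp dir (file names taken from this PR, directory layout assumed):

```shell
# Sketch only: create placeholder configs, then expose them under the
# conventional names conf/train.yaml and conf/decode.yaml.
tmp=$(mktemp -d)
mkdir -p "$tmp/conf/tuning"
touch "$tmp/conf/tuning/train_discrete_asr_e_branchformer1_1gpu_lr5e-4_warmup5k.yaml"
touch "$tmp/conf/decode_ctc0.3.yaml"

# Conventional names resolve to the chosen first-setup configs:
ln -s tuning/train_discrete_asr_e_branchformer1_1gpu_lr5e-4_warmup5k.yaml "$tmp/conf/train.yaml"
ln -s decode_ctc0.3.yaml "$tmp/conf/decode.yaml"
```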

@Stanwang1210
Contributor Author

@ftshijt
I made some adjustments based on your suggestions; please help check them. Thank you!

@ftshijt
Collaborator

ftshijt commented Sep 6, 2024

Thanks for your contribution~

@ftshijt ftshijt merged commit 9d607c1 into espnet:master Sep 6, 2024
Shikhar-S pushed a commit to Shikhar-S/espnet that referenced this pull request Mar 13, 2025

Labels

ASR Automatic speech recognition ESPnet2 README Recipe

3 participants