S2T Recipe for IPAPack++: main recipe #6168
Merged
sw005320 merged 13 commits into espnet:master on Jun 30, 2025
Pull Request Overview
This pull request adds a new S2T recipe for the IPAPack++ dataset and integrates it into the existing ESPnet2 framework while also enhancing related training and configuration options.
- New recipe directory with dataset-specific utilities, scripts, and configurations
- Integration of IPAPack++ into dataset registration and ASR database
- Enhancement of BPE training support by introducing the --bpe_largecorpus option
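As a rough illustration of what a large-corpus switch for BPE training could look like, the sketch below assembles arguments for SentencePiece's `spm_train`. It assumes that `--bpe_largecorpus` maps onto SentencePiece's `--train_extremely_large_corpus` flag; the PR text does not state the mechanism, and the helper name `build_spm_train_args` and its parameters are hypothetical rather than taken from `s2t.sh`.

```python
from typing import List


def build_spm_train_args(
    input_path: str,
    model_prefix: str,
    vocab_size: int,
    model_type: str = "unigram",
    bpe_largecorpus: bool = False,
) -> List[str]:
    """Assemble spm_train arguments (hypothetical helper, not ESPnet code)."""
    args = [
        f"--input={input_path}",
        f"--model_prefix={model_prefix}",
        f"--vocab_size={vocab_size}",
        f"--model_type={model_type}",
    ]
    if bpe_largecorpus:
        # Assumption: the recipe option corresponds to this SentencePiece flag,
        # which is intended for training on very large corpora.
        args.append("--train_extremely_large_corpus=true")
    return args


if __name__ == "__main__":
    # Example paths and vocabulary size are placeholders.
    print(" ".join(build_spm_train_args("train.txt", "bpe", 40000, bpe_largecorpus=True)))
```

Keeping the flag behind a boolean that defaults to off leaves existing recipes unaffected unless they opt in.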
Reviewed Changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| egs2/ipapack_plus/s2t1/utils | Symlink pointing to the TEMPLATE utilities |
| egs2/ipapack_plus/s2t1/scripts | Symlink pointing to the TEMPLATE scripts |
| egs2/ipapack_plus/s2t1/s2t.sh | Symlink for S2T script from TEMPLATE |
| egs2/ipapack_plus/s2t1/run.sh | New run script setting up training and inference parameters |
| egs2/ipapack_plus/s2t1/pyscripts | Symlink pointing to the TEMPLATE Python scripts |
| egs2/ipapack_plus/s2t1/path.sh | Symlink pointing to the TEMPLATE path script |
| egs2/ipapack_plus/s2t1/local/utils.py | Utility file defining symbols, language tokens, and phoneme vocabulary |
| egs2/ipapack_plus/s2t1/db.sh | Symlink pointing to the TEMPLATE database script |
| egs2/ipapack_plus/s2t1/conf/* | New configuration files for tuning, slurm, queue, pitch, pbs, fbank, and decode setups |
| egs2/ipapack_plus/s2t1/cmd.sh | Command management script for job scheduling and execution |
| egs2/ipapack_plus/s2t1/README.md | Recipe documentation including data prep and training guidelines |
| egs2/TEMPLATE/s2t1/s2t.sh | Updates to add the --bpe_largecorpus option for large corpus support |
| egs2/TEMPLATE/asr1/db.sh | Database script update to include IPAPack++ dataset entry |
| egs2/README.md | General dataset registration extended to support IPAPack++ |
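To make the role of a utility file like local/utils.py more concrete, here is a minimal sketch of how shared symbols, task tokens, language tokens, and a phoneme inventory might be combined into one token list. Every name and value in it (the symbol strings, the `<pr>` task token, the three language codes, the six phonemes, and the helper functions) is an illustrative assumption, not the actual contents of the recipe's file.

```python
from typing import Dict, List

# All values below are illustrative placeholders, not the recipe's real inventory.
SPECIAL_SYMBOLS: List[str] = ["<blank>", "<unk>", "<sos>", "<eos>"]
TASK_TOKENS: List[str] = ["<pr>"]  # hypothetical marker for phone recognition
LANGUAGES: List[str] = ["eng", "deu", "jpn"]
PHONEMES: List[str] = ["a", "i", "u", "p", "t", "k"]


def language_token(lang: str) -> str:
    """Wrap a language code in angle brackets, e.g. 'eng' -> '<eng>'."""
    return f"<{lang}>"


def build_token_list() -> List[str]:
    """Concatenate special symbols, task tokens, language tokens, and phonemes."""
    tokens = list(SPECIAL_SYMBOLS)
    tokens += TASK_TOKENS
    tokens += [language_token(lang) for lang in LANGUAGES]
    tokens += PHONEMES
    # Guard against accidental collisions between the groups.
    if len(set(tokens)) != len(tokens):
        raise ValueError("duplicate tokens in vocabulary")
    return tokens


def token_to_id(tokens: List[str]) -> Dict[str, int]:
    """Map each token to its index in the list."""
    return {tok: idx for idx, tok in enumerate(tokens)}
```

Defining these groups in one module lets data preparation and scoring scripts share a single vocabulary instead of each hard-coding its own.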
sw005320
reviewed
Jun 27, 2025
This pull request introduces a new recipe for the IPAPack++ dataset in the ESPnet2 framework. It adds dataset-specific configurations, utilities, and scripts, and updates the shared templates to support BPE training on very large corpora.
IPAPack++ Recipe Additions:
- egs2/ipapack_plus/s2t1/, containing scripts and configurations for the IPAPack++ dataset, including README.md, cmd.sh, conf/, local/utils.py, path.sh, and pyscripts. These files provide the structure and guidelines for training and evaluation on IPAPack++.
Framework Enhancements:
- --bpe_largecorpus in egs2/TEMPLATE/s2t1/s2t.sh to enable training on extremely large corpora. This includes associated logic for handling large datasets during BPE training.
Dataset Integration:
- IPAPack++ entry in egs2/README.md, and updated egs2/TEMPLATE/asr1/db.sh to include a placeholder for its download directory.
Configuration Updates:
- Configuration files for feature extraction (fbank.conf and pitch.conf) and decoding (decode_s2t_pr.yaml) tailored for the IPAPack++ dataset.
- train_s2t_ebf_conv2d_size768_e9_d9_piecewise_lr5e-4_warmup60k_flashattn.yaml, a training configuration optimized for multilingual phone recognition tasks.
Utilities and Symbols:
- local/utils.py with definitions for shared symbols, task tokens, supported languages, and phoneme vocabulary to streamline multilingual and phoneme-based processing.
What did you change?
New S2T recipe for IPAPack++
Why did you make this change?
Provide a basic setup for developing a multitask phone recognition model
Is your PR small enough?
Yes
Additional Context
Related to #6169