S2T Recipe for IPAPack++: main recipe #6168

Merged: sw005320 merged 13 commits into espnet:master from chinjouli:ipapack_recipe on Jun 30, 2025

chinjouli (Contributor) commented on Jun 23, 2025

This pull request introduces a new recipe for the IPAPack++ dataset in the ESPnet2 framework, along with several enhancements and configurations to support its usage. The changes include the addition of dataset-specific configurations, utilities, and scripts, as well as updates to the general framework to handle large corpora and improve flexibility in training.

IPAPack++ Recipe Additions:

  • New Recipe Directory: Added a new directory egs2/ipapack_plus/s2t1/ containing scripts and configurations for the IPAPack++ dataset, including README.md, cmd.sh, conf/, local/utils.py, path.sh, and pyscripts. These files provide the structure and guidelines for training and evaluation on IPAPack++.

Framework Enhancements:

  • Support for Large Corpus in BPE Training: Introduced a new option --bpe_largecorpus in egs2/TEMPLATE/s2t1/s2t.sh, together with the associated logic, so that BPE training can run on extremely large corpora.
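As a rough illustration of how a large-corpus switch can be threaded through to BPE training, the sketch below builds an argument list for SentencePiece's `spm_train`. It assumes that a flag like `--bpe_largecorpus` ends up enabling SentencePiece's `train_extremely_large_corpus` option; the PR description does not confirm that mapping. The helper name `build_spm_train_args` and the example paths are hypothetical and are not taken from `s2t.sh`.

```python
from typing import List


def build_spm_train_args(
    input_path: str,
    model_prefix: str,
    vocab_size: int,
    model_type: str = "bpe",
    large_corpus: bool = False,
) -> List[str]:
    """Build spm_train arguments (hypothetical helper, not part of ESPnet)."""
    args = [
        f"--input={input_path}",
        f"--model_prefix={model_prefix}",
        f"--vocab_size={vocab_size}",
        f"--model_type={model_type}",
    ]
    if large_corpus:
        # Assumption: the recipe-level switch maps to this SentencePiece option,
        # which is intended for very large training corpora.
        args.append("--train_extremely_large_corpus=true")
    return args


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = ["spm_train"] + build_spm_train_args(
        "data/token_list/train.txt", "data/token_list/bpe", 500, large_corpus=True
    )
    print(" ".join(cmd))
```

Keeping the switch as a plain boolean that appends one extra argument means the default behavior is unchanged for recipes that do not need it.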

Dataset Integration:

  • IPAPack++ Dataset Registration: Added the IPAPack++ dataset to the list of supported datasets in egs2/README.md and updated egs2/TEMPLATE/asr1/db.sh to include a placeholder for its download directory.

Configuration Updates:

  • Audio and Feature Extraction: Added configurations for feature extraction (fbank.conf and pitch.conf) and decoding (decode_s2t_pr.yaml) tailored for the IPAPack++ dataset.
  • Training Configurations: Introduced a new training configuration file train_s2t_ebf_conv2d_size768_e9_d9_piecewise_lr5e-4_warmup60k_flashattn.yaml optimized for multilingual phone recognition tasks.

Utilities and Symbols:

  • Language and Phoneme Support: Added a utility file local/utils.py with definitions for shared symbols, task tokens, supported languages, and phoneme vocabulary to streamline multilingual and phoneme-based processing.
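To make the role of such a utility file concrete, here is a minimal sketch of how shared symbols, task tokens, language tokens, and a phoneme inventory might be combined into one token list. Every symbol, language code, token format, and function name below is a hypothetical placeholder; none of it is taken from the actual local/utils.py.

```python
from typing import Dict, List

# Hypothetical placeholders; the real definitions live in local/utils.py.
SHARED_SYMBOLS = ["<blank>", "<unk>", "<sos>", "<eos>"]
TASK_TOKENS = ["<pr>", "<asr>"]
LANGUAGES = ["eng", "cmn"]
PHONEMES = ["a", "i", "u", "p", "t", "k"]


def build_token_list(
    shared: List[str],
    tasks: List[str],
    languages: List[str],
    phonemes: List[str],
) -> List[str]:
    """Concatenate the token groups in a fixed order and reject duplicates."""
    lang_tokens = [f"<{lang}>" for lang in languages]
    phone_tokens = [f"/{ph}/" for ph in phonemes]
    tokens = [*shared, *lang_tokens, *tasks, *phone_tokens]
    if len(set(tokens)) != len(tokens):
        raise ValueError("duplicate token in vocabulary")
    return tokens


def token_to_id(tokens: List[str]) -> Dict[str, int]:
    """Map each token to its index in the list."""
    return {tok: idx for idx, tok in enumerate(tokens)}


if __name__ == "__main__":
    vocab = build_token_list(SHARED_SYMBOLS, TASK_TOKENS, LANGUAGES, PHONEMES)
    print(len(vocab), token_to_id(vocab)["<eng>"])
```

Wrapping phonemes in delimiters, as done here, is one way to keep them from colliding with special tokens or with ordinary text units.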

What did you change?

New s2t recipe for IPAPack++


Why did you make this change?

Provide a basic setup for developing a multitask phone recognition model


Is your PR small enough?

Yes


Additional Context

Related to #6169

dosubot bot added the size:XL label (this PR changes 500-999 lines, ignoring generated files) on Jun 23, 2025
dosubot bot added the Recipe label on Jun 23, 2025
sw005320 requested a review from Copilot on June 23, 2025 17:34
sw005320 added this to the v.202506 milestone on Jun 23, 2025

Copilot AI left a comment

Pull Request Overview

This pull request adds a new S2T recipe for the IPAPack++ dataset and integrates it into the existing ESPnet2 framework while also enhancing related training and configuration options.

  • New recipe directory with dataset-specific utilities, scripts, and configurations
  • Registration of IPAPack++ in the supported-dataset list (egs2/README.md) and in the template database script (egs2/TEMPLATE/asr1/db.sh)
  • Enhancement of BPE training support by introducing the --bpe_largecorpus option

Reviewed Changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| egs2/ipapack_plus/s2t1/utils | Symlink pointing to the TEMPLATE utilities |
| egs2/ipapack_plus/s2t1/scripts | Symlink pointing to the TEMPLATE scripts |
| egs2/ipapack_plus/s2t1/s2t.sh | Symlink for S2T script from TEMPLATE |
| egs2/ipapack_plus/s2t1/run.sh | New run script setting up training and inference parameters |
| egs2/ipapack_plus/s2t1/pyscripts | Symlink pointing to the TEMPLATE Python scripts |
| egs2/ipapack_plus/s2t1/path.sh | Symlink pointing to the TEMPLATE path script |
| egs2/ipapack_plus/s2t1/local/utils.py | Utility file defining symbols, language tokens, and phoneme vocabulary |
| egs2/ipapack_plus/s2t1/db.sh | Symlink pointing to the TEMPLATE database script |
| egs2/ipapack_plus/s2t1/conf/* | New configuration files for tuning, slurm, queue, pitch, pbs, fbank, and decode setups |
| egs2/ipapack_plus/s2t1/cmd.sh | Command management script for job scheduling and execution |
| egs2/ipapack_plus/s2t1/README.md | Recipe documentation including data prep and training guidelines |
| egs2/TEMPLATE/s2t1/s2t.sh | Updates to add the --bpe_largecorpus option for large corpus support |
| egs2/TEMPLATE/asr1/db.sh | Database script update to include IPAPack++ dataset entry |
| egs2/README.md | General dataset registration extended to support IPAPack++ |

sw005320 merged commit 333b6f7 into espnet:master on Jun 30, 2025
38 checks passed
chinjouli deleted the ipapack_recipe branch on October 27, 2025 05:13

Labels

ESPnet2, README, Recipe, size:XL (this PR changes 500-999 lines, ignoring generated files)
