Qingzheng-Wang · 2025-06-19T06:37:08Z

What did you change?

Added a new LID recipe template under egs2/TEMPLATE/lid1, including config files and basic run scripts for reproducible experiments.

Why did you make this change?

This provides a ready-to-use starting point for LID experiments, aligned with ESPnet2 standards.

Is your PR small enough?

This PR is slightly over 1000 lines, but keeping it unified is better for clarity since the components are tightly coupled.

Additional Context

Links to previous LID PRs
Depends on:
Actual dataset-specific recipes (e.g., VoxLingua107) will be submitted in follow-up PRs

for more information, see https://pre-commit.ci

codecov · 2025-06-19T06:58:36Z

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.82%. Comparing base (fc8e461) to head (dcadf5b).
⚠️ Report is 8 commits behind head on master.

Files with missing lines	Patch %	Lines
espnet2/bin/lid_inference.py	0.00%	2 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #6160   +/-   ##
=======================================
  Coverage   55.82%   55.82%           
=======================================
  Files         884      884           
  Lines       84012    84012           
=======================================
  Hits        46903    46903           
  Misses      37109    37109

Flag	Coverage Δ
test_integration_espnet2	`46.16% <ø> (ø)`
test_integration_espnetez	`36.94% <ø> (ø)`
test_python_espnet2	`50.51% <60.00%> (ø)`
test_python_espnetez	`12.83% <0.00%> (ø)`
test_utils	`18.77% <ø> (ø)`

sw005320 · 2025-06-19T11:34:27Z

This pull request introduces a new template for spoken language identification (lid1) in ESPnet2. The changes include detailed documentation, configuration files, scripts, and utilities to support the language identification pipeline. Key updates focus on recipe flow, job scheduling, scoring, and data preparation.

Recipe and Documentation Updates:

egs2/TEMPLATE/lid1/README.md: Added comprehensive documentation for the lid1 recipe, detailing the pipeline stages, data preparation, model training, inference, scoring, and visualization. Includes an example for VoxLingua107 training.

Job Scheduling and Configuration:

egs2/TEMPLATE/lid1/cmd.sh: Introduced backend selection for job scheduling systems (local, slurm, sge, etc.), enabling flexible job execution environments.
Configuration files (pbs.conf, queue.conf, slurm.conf): Added default settings for job scheduling systems, including memory, GPU, and thread specifications.

Scoring and Evaluation:

egs2/TEMPLATE/lid1/local/score.py: Implemented scoring script to calculate accuracy, macro accuracy, and error frequencies for language identification predictions. Supports detailed error analysis.

Data Preparation and Utilities:

egs2/TEMPLATE/lid1/local/copy_data_dir.sh: Added script to copy and modify Kaldi-style data directories for language identification tasks, with options for prefix/suffix adjustments.
egs2/TEMPLATE/lid1/db.sh: Defined paths for datasets, enabling automatic downloads for supported corpora.

Symlinks and Setup:

Symlinks (pyscripts, scripts, steps, utils) and setup.sh: Created symlinks to shared resources and added setup script for initializing the lid1 recipe directory.

Copilot

Pull Request Overview

This PR adds a new LID recipe template under egs2/TEMPLATE/lid1 with configuration files, run scripts, and utility scripts for reproducible LID experiments following ESPnet2 standards.

Added symlinked utility folders (utils, steps, scripts, pyscripts) referencing the asr1 recipe.
Introduced setup, scoring, and data copying scripts along with various job scheduler configuration files.

Reviewed Changes

Copilot reviewed 14 out of 16 changed files in this pull request and generated 1 comment.

File	Description
egs2/TEMPLATE/lid1/utils	Points to asr1 utilities via a relative path.
egs2/TEMPLATE/lid1/steps	Points to asr1 steps via a relative path.
egs2/TEMPLATE/lid1/setup.sh	Adds environment setup with copying and symlink creation logic.
egs2/TEMPLATE/lid1/scripts	Points to asr1 scripts via a relative path.
egs2/TEMPLATE/lid1/pyscripts	Points to asr1 pyscripts via a relative path.
egs2/TEMPLATE/lid1/path.sh	Sets up the PATH and other environment variables.
egs2/TEMPLATE/lid1/local/score.py	Provides scoring logic for LID experiments.
egs2/TEMPLATE/lid1/local/copy_data_dir.sh	Copies and maps data files with optional prefix/suffix modifications.
egs2/TEMPLATE/lid1/db.sh	Declares download paths for various corpora.
egs2/TEMPLATE/lid1/conf/*	Contains configuration files for different schedulers.
egs2/TEMPLATE/lid1/cmd.sh	Configures command execution strategy for different backends.
egs2/TEMPLATE/lid1/README.md	Documents the LID recipe flow and user instructions.

Comments suppressed due to low confidence (1)

egs2/TEMPLATE/lid1/local/copy_data_dir.sh:119

The variable 'spk_map' is used here, but it is never defined or created in the script. Consider using 'lang_map' or ensuring that 'spk_map' is properly initialized before this step.

  utils/apply_map.pl -f 1 $destdir/spk_map <$srcdir/spk2gender >$destdir/spk2gender

egs2/TEMPLATE/lid1/setup.sh

Change local files from copy from template, to symbolink from the template.

sw005320 · 2025-08-28T14:57:59Z

@ftshijt, can you review this PR?
I took a look at this and it looks good to me.

Fhrozen · 2025-08-29T11:48:17Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive new recipe template for Language Identification (LID), which is a great addition. The structure follows existing ESPnet conventions. However, my review identified several critical and high-severity issues. These include a script that will fail due to a missing file, bugs in the main recipe script related to incorrect logic and filenames, and the use of fragile file-renaming workarounds that could leave the system in an inconsistent state. Addressing these issues is crucial for ensuring the robustness and correctness of this new template.

egs2/TEMPLATE/lid1/setup.sh

egs2/TEMPLATE/lid1/lid.sh

gemini-code-assist · 2025-08-29T11:50:12Z

egs2/TEMPLATE/lid1/lid.sh

+        mv "data/${train_set}/utt2lang" "data/${train_set}/utt2spk"
+        mv "data/${train_set}/lang2utt" "data/${train_set}/spk2utt"


Temporarily renaming utt2lang to utt2spk and lang2utt to spk2utt is a fragile practice. If the script fails during the speed perturbation stage, the data directory will be left in an inconsistent state. A more robust approach would be to modify local/perturb_lid_data_dir_speed.sh to handle utt2lang and lang2utt files directly, which would eliminate the need for these mv commands.

I can modify the perturb_lid_data_dir_speed.sh script to replace utt2spk with utt2lang and spk2utt with lang2utt. However, lid.sh still relies on many util scripts that are originally designed for speaker-based setups, such as utils/combine_data.sh, utils/data/fix_data_dir.sh, utils/validate_data_dir.sh, and utils/subset_data_dir.sh. This leaves us with two options:

Option 1: Before calling these scripts, rename the files (mv ./utt2lang ./utt2spk and mv ./lang2utt ./spk2utt), run the scripts without any modifications, then rename the files back (mv ./utt2spk ./utt2lang and mv ./spk2utt ./lang2utt).

Pros: No need to duplicate or maintain modified versions of existing scripts.

Cons: If execution stops between the renaming steps, the directory may be left inconsistent. That said, such cases are rare, and users can simply clean up and rerun the stage if needed.

Option 2: Copy these scripts from the utils/ directory into the local folder (e.g., cp utils/fix_data_dir.sh local/fix_data_dir.sh), change all utt2spk/spk2utt references in the copies to utt2lang/lang2utt, and call the modified versions in the recipe.

Pros: Keeps file naming (utt2lang/lang2utt) consistent at all times.

Cons: Creates repetitive code, and any updates to the upstream utils scripts would need to be manually synced.

Although option 1 may cause file inconsistency if the process breaks between renaming steps, I think such cases are rare, and users can simply delete the generated files and rerun the stage. Option 2, on the other hand, would create a lot of repetitive code, which may not be necessary. Therefore, I prefer to use option 1, and in other lid1 recipes, when I create data, I will also follow this style.
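
Editorial sketch (not part of the PR): the inconsistency risk of option 1 can be reduced by guaranteeing that the rename is undone even when the wrapped step fails. The snippet below illustrates the idea in Python with a context manager. The helper name `as_speaker_files` and the `demo` function are hypothetical; the recipe itself performs the renames with plain `mv` in shell.

```python
import os
import tempfile
from contextlib import contextmanager

# Hypothetical helper for illustration; not part of lid.sh or the ESPnet utils.
RENAMES = (("utt2lang", "utt2spk"), ("lang2utt", "spk2utt"))


@contextmanager
def as_speaker_files(data_dir, renames=RENAMES):
    """Temporarily expose LID files under the speaker-style names."""
    done = []
    try:
        for lang_name, spk_name in renames:
            os.rename(os.path.join(data_dir, lang_name),
                      os.path.join(data_dir, spk_name))
            done.append((lang_name, spk_name))
        yield data_dir
    finally:
        # Undo only the renames that actually happened, in reverse order.
        for lang_name, spk_name in reversed(done):
            spk_path = os.path.join(data_dir, spk_name)
            if os.path.exists(spk_path):
                os.rename(spk_path, os.path.join(data_dir, lang_name))


def demo():
    """Simulate a failing step and report the file names during and after it."""
    with tempfile.TemporaryDirectory() as d:
        for name in ("utt2lang", "lang2utt"):
            open(os.path.join(d, name), "w").close()
        inside = None
        try:
            with as_speaker_files(d):
                inside = sorted(os.listdir(d))
                raise RuntimeError("simulated failure in a utils/ script")
        except RuntimeError:
            pass
        return inside, sorted(os.listdir(d))
```

In this sketch, `demo()` sees the speaker-style names inside the `with` block and the original LID names afterwards, despite the simulated failure. A hard kill of the process would still bypass the cleanup, so this narrows the window rather than closing it.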

egs2/TEMPLATE/lid1/lid.sh

gemini-code-assist · 2025-08-29T11:50:13Z

egs2/TEMPLATE/lid1/local/copy_data_dir.sh

+mv $destdir/lang2utt $destdir/spk2utt
+mv $destdir/utt2lang $destdir/utt2spk
+utils/validate_data_dir.sh $validate_opts $destdir
+mv $destdir/spk2utt $destdir/lang2utt
+mv $destdir/utt2spk $destdir/utt2lang


Renaming lang2utt to spk2utt and utt2lang to utt2spk just to run validate_data_dir.sh is a fragile approach. If the validation script fails, the data directory will be left in an inconsistent state. To improve robustness, please consider creating a LID-specific validation script or adapting validate_data_dir.sh to avoid these temporary file moves.

The reason for keeping this style is the same as with perturb_lid_data_dir_speed.sh discussed earlier; please refer to the previous conversation.

gemini-code-assist · 2025-08-29T11:50:13Z

egs2/TEMPLATE/lid1/local/perturb_lid_data_dir_speed.sh

+if [[ ! -f ${srcdir}/utt2spk ]]; then
+  echo "$0: no such file ${srcdir}/utt2spk"
+  exit 1;


This script is intended for LID data but expects Kaldi-style speaker information files (utt2spk) instead of language information files (utt2lang). This forces the main lid.sh script to use a fragile mv hack. Please modify this script to work directly with utt2lang and lang2utt to improve robustness. You should check for utt2lang instead of utt2spk.

Suggested change

-if [[ ! -f ${srcdir}/utt2spk ]]; then
-  echo "$0: no such file ${srcdir}/utt2spk"
-  exit 1;
+if [[ ! -f ${srcdir}/utt2lang ]]; then
+  echo "$0: no such file ${srcdir}/utt2lang"
+  exit 1;
+fi

No. Before this script is called, the file has already been renamed from utt2lang to utt2spk.

For more details, please refer to previous comments on perturb_lid_data_dir_speed.sh.
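
Editorial sketch (not part of the PR): for readers unfamiliar with the two files in this thread, the snippet below shows how a lang2utt listing could be derived from utt2lang. It assumes the LID files mirror the Kaldi utt2spk/spk2utt layout (one `<utt_id> <lang>` pair per line, and one `<lang> <utt_id> ...` list per line), which is consistent with them being passed to the Kaldi-style utilities after renaming. The function name is hypothetical.

```python
from collections import defaultdict


def utt2lang_to_lang2utt(lines):
    """Invert '<utt_id> <lang>' lines into sorted '<lang> <utt_id> ...' lines."""
    lang2utt = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        utt_id, lang = line.split(maxsplit=1)
        lang2utt[lang].append(utt_id)
    return [
        f"{lang} {' '.join(sorted(utts))}"
        for lang, utts in sorted(lang2utt.items())
    ]
```

For example, the input lines `eng_utt1 eng`, `deu_utt1 deu`, and `eng_utt2 eng` yield `deu deu_utt1` and `eng eng_utt1 eng_utt2`.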

sw005320 · 2025-08-29T21:57:08Z

@Qingzheng-Wang, please check Gemini's reviews and reflect on them

@ftshijt, can you review this PR?

Qingzheng-Wang · 2025-09-01T05:12:05Z

@Qingzheng-Wang, please check Gemini's reviews and reflect on them

@ftshijt, can you review this PR?

Yes, I’ve just addressed Gemini’s comments.

ftshijt

The current setup looks good to me in general. I just have a few potential additions that might be interesting to consider as a follow-up:

Whether we can support multiple languages in a single utterance (related to code-switching or long-audio processing scenarios)
Please also include the lid task in mini_an4 to integrate it into the CI

ftshijt · 2025-09-02T04:31:31Z

egs2/TEMPLATE/lid1/lid.sh

+            for x in music noise speech; do
+                if [ -f data/musan_${x}.scp ]; then
+                    cp data/musan_${x}.scp ${data_feats}/musan_${x}.scp
+                fi


Do we need to hard-code MUSAN for augmentation (e.g., what if we need another noise database for augmentation)?

We can consider making this more flexible in the future. For now, it supports using Musan. In my experiments, the performance remains strong even without it.

ftshijt · 2025-09-02T04:33:09Z

egs2/TEMPLATE/lid1/local/score.py

+    accuracy = correct / total
+    accuracy_per_lang = {lang: lang_correct[lang] / lang_total[lang] for lang in langs}
+    macro_accuracy = sum(accuracy_per_lang.values()) / len(langs)


For a balanced set, accuracy is a good metric. Shall we also consider precision, recall, etc.?

Good suggestion. I've just added precision, recall, and F1 score computation.
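
Editorial sketch (not part of the PR): the snippet below shows one way to compute the metrics discussed in this thread from reference and predicted language labels. It is not the code in local/score.py; the function name and return layout are assumptions. Note that the per-language accuracy in the quoted diff (correct divided by total for each reference language) is the same quantity as per-language recall, so macro accuracy equals macro-averaged recall.

```python
from collections import Counter


def lid_metrics(refs, hyps):
    """Return accuracy, macro accuracy, and per-language precision/recall/F1."""
    if len(refs) != len(hyps) or not refs:
        raise ValueError("refs and hyps must be non-empty and the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for ref, hyp in zip(refs, hyps):
        if ref == hyp:
            tp[ref] += 1
        else:
            fn[ref] += 1  # the true language was missed
            fp[hyp] += 1  # the predicted language was wrong
    langs = sorted(set(refs))
    per_lang = {}
    for lang in langs:
        predicted = tp[lang] + fp[lang]
        actual = tp[lang] + fn[lang]  # always > 0 because lang comes from refs
        precision = tp[lang] / predicted if predicted else 0.0
        recall = tp[lang] / actual
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_lang[lang] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(tp.values()) / len(refs)
    macro_accuracy = sum(m["recall"] for m in per_lang.values()) / len(langs)
    return accuracy, macro_accuracy, per_lang
```

With references `eng, eng, eng, deu` and predictions `eng, eng, deu, deu`, overall accuracy is 3/4 while macro accuracy is (2/3 + 1)/2 = 5/6, which shows how the two diverge on an imbalanced set.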

Fhrozen · 2025-09-02T07:01:38Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a new recipe template for Language Identification (LID). The changes are comprehensive, adding all the necessary scripts and configurations for a new lid1 task. My review focuses on improving the robustness and correctness of the new scripts. I've identified several areas for improvement:

The shell scripts use fragile mv commands to temporarily rename files to be compatible with existing utility scripts. This can leave the data directory in an inconsistent state if the script is interrupted. I've suggested modifying the utility scripts to handle LID-specific files directly.
Some Python scripts do not correctly handle utterance IDs that may contain spaces, which could lead to parsing errors. I've provided suggestions to fix this.
The setup.sh script references a file that is missing from the pull request, which would cause it to fail.

Overall, this is a great addition, and addressing these points will make the new recipe template more robust and reliable.
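
Editorial sketch (not part of the PR): the concrete fixes Gemini proposed for the whitespace issue are not shown in this thread. One common approach, assuming each line ends with a single whitespace-free label (such as a language code), is to split only once from the right so that everything before the label is kept as the utterance ID. The function name is hypothetical.

```python
def parse_id_label(line):
    """Split '<utt_id> <label>' where utt_id may itself contain spaces."""
    parts = line.strip().rsplit(maxsplit=1)
    if len(parts) != 2:
        raise ValueError(f"expected '<utt_id> <label>', got: {line!r}")
    utt_id, label = parts
    return utt_id, label
```

A plain `line.split()` followed by taking the first two fields would instead read `my utt 001 eng` as ID `my` with label `utt`.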

egs2/TEMPLATE/lid1/setup.sh

egs2/TEMPLATE/lid1/lid.sh

egs2/TEMPLATE/lid1/local/copy_data_dir.sh

egs2/TEMPLATE/lid1/local/prepare_ood_test.py

egs2/TEMPLATE/lid1/local/score.py

Qingzheng-Wang · 2025-09-02T19:19:28Z

The current setup looks good to me in general. I just have a few potential additions that might be interesting to consider as a follow-up:

Whether we can support multiple languages in a single utterance (related to code-switching or long-audio processing scenarios)

Please also include the lid task in mini_an4 to integrate it into the CI

Thank you for the review!

Currently we only support one language per utterance. For code-switching cases or long audio scenarios, I think this would require language diarization techniques, which is an interesting direction to explore in the future.

I've also added the mini_an4 lid1 recipe in PR #6210.

sw005320 · 2025-09-09T14:17:08Z

Thanks!
Please move to the next one.

Qingzheng-Wang added 5 commits June 19, 2025 02:32

Add lid score.

ab22e60

Add recipe readme.

e47c409

Add lid recipe template.

946e908

Add lid script.

98c5894

Add adapted copy data dir for lid task.

346e218

dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Jun 19, 2025

mergify bot added ESPnet2 README labels Jun 19, 2025

dosubot bot added the Recipe label Jun 19, 2025

[pre-commit.ci] auto fixes from pre-commit.com hooks

041f39a

for more information, see https://pre-commit.ci

sw005320 requested a review from Copilot June 19, 2025 11:34

Copilot AI reviewed Jun 19, 2025

egs2/TEMPLATE/lid1/setup.sh (outdated, resolved)

Qingzheng-Wang and others added 11 commits June 19, 2025 21:04

Fix targets dir.

09c91f2

Re CI.

8610856

[pre-commit.ci] auto fixes from pre-commit.com hooks

0bf9775

for more information, see https://pre-commit.ci

Change lid_inference_dist to lid_inference.

ae6bb29

Fix inference log names.

fa8f258

Fix comment.

da5f5e7

Add voxlingua107 recipe.

61131a3

Add readme.

7da2d94

Add run script.

2a52c39

Add config.

86e6d7f

Add data preparation scritpts.

48852e4

Qingzheng-Wang and others added 5 commits August 19, 2025 21:20

Fix: remove spk.

2e2059d

Fix local file copy.

d7523d7

Change local files from copy from template, to symbolink from the template.

Fix lengthy lines.

fd1c90e

[pre-commit.ci] auto fixes from pre-commit.com hooks

0c3178c

for more information, see https://pre-commit.ci

Merge branch 'master' into lid_release6

7709174

Merge branch 'master' into lid_release6

cd374d7

Fhrozen closed this Aug 29, 2025

Fhrozen reopened this Aug 29, 2025

gemini-code-assist bot reviewed Aug 29, 2025

Qingzheng-Wang and others added 3 commits August 31, 2025 20:31

Merge branch 'master' into lid_release6

27815d4

Fixed run generation.

6934c16

Fixed check.

4c508dd

Qingzheng-Wang closed this Sep 1, 2025

Qingzheng-Wang reopened this Sep 1, 2025

Qingzheng-Wang and others added 2 commits August 31, 2025 22:20

Re CI.

65f1047

[pre-commit.ci] auto fixes from pre-commit.com hooks

fc8e461

for more information, see https://pre-commit.ci

ftshijt reviewed Sep 2, 2025

gemini-code-assist bot reviewed Sep 2, 2025

Qingzheng-Wang and others added 3 commits September 2, 2025 12:13

Add precision, recall, and F1 metrics.

d304e91

Merge branch 'master' into lid_release6

22d7933

[pre-commit.ci] auto fixes from pre-commit.com hooks

dcadf5b

for more information, see https://pre-commit.ci

sw005320 merged commit 80bf926 into espnet:master Sep 9, 2025
34 checks passed

Fhrozen mentioned this pull request Sep 11, 2025

Version Release #6236

Merged
