
Support arbitrary language finetune for Whisper models.#5344

Merged
sw005320 merged 9 commits into espnet:master from pengchengguo:whisper
Aug 3, 2023

Conversation

@pengchengguo
Collaborator

  1. For the `asr.sh` script, use `--lang` directly as the language ID when exporting the Whisper vocabulary.
  2. For the training procedure, add an additional `tokenizer_language` option for the preprocessor in the config files, for example:

     ```yaml
     preprocessor: default
     preprocessor_conf:
         tokenizer_language: "zh"
     ```

  3. Add the fine-tuning results on the Aishell corpus. Compared with other methods, fine-tuning Whisper achieves the best results.
[image: Aishell fine-tuning results table]
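The configuration flow above can be sketched in plain Python. This is a minimal sketch; the class and function names below are illustrative stand-ins, not the actual ESPnet preprocessor API:

```python
# Hypothetical sketch of how a preprocessor_conf entry like the one above
# could be threaded down to a Whisper tokenizer wrapper.
preprocessor_conf = {"tokenizer_language": "zh"}


class WhisperTokenizerStub:
    """Stand-in for the real Whisper tokenizer wrapper."""

    def __init__(self, language: str):
        self.language = language


def build_preprocessor_tokenizer(conf: dict) -> WhisperTokenizerStub:
    # Fall back to English when no tokenizer_language is configured.
    language = conf.get("tokenizer_language", "en")
    return WhisperTokenizerStub(language=language)


tok = build_preprocessor_tokenizer(preprocessor_conf)
print(tok.language)  # → zh
```

The point of the extra option is only to carry the language choice from the YAML config into the tokenizer construction step.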

@sw005320 sw005320 requested a review from simpleoier July 22, 2023 14:37
@sw005320 sw005320 added this to the v.202307 milestone Jul 22, 2023
@sw005320 sw005320 added New Features ASR Automatic speech recognition labels Jul 22, 2023
Contributor

@sw005320 sw005320 left a comment


LGTM.
Please also add some tests.


- ASR config: [conf/tuning/train_asr_whisper_medium_finetune.yaml](conf/tuning/train_asr_whisper_medium_finetune.yaml)
- #Params: 762.32 M
- Model link:
Contributor

As we discussed, please upload a model.

Collaborator

@simpleoier simpleoier left a comment

Thanks!
I only have one concern about unseen lang_id used in whisper.


## Results

- ASR config: [conf/tuning/train_asr_whisper_medium_finetune.yaml](conf/tuning/train_asr_whisper_medium_finetune.yaml)
Collaborator

Is decode config needed here?

Collaborator Author

Thanks, it should be included.

```bash
fi

_opts=""
if [ "${token_type}" = "whisper_multilingual" ]; then
```
Collaborator

Would default lang=noinfo work here?

Collaborator Author

@pengchengguo pengchengguo Jul 24, 2023

I added a LANGUAGES_CODE_MAPPING to map the language codes of ESPnet to the language IDs of Whisper and to make sure the input language code is supported by the Whisper models.
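The idea can be sketched as follows. The entries and the helper function below are illustrative only; the real table in espnet2/text/whisper_tokenizer.py is larger and its entries may differ:

```python
# Illustrative sketch of a LANGUAGES_CODE_MAPPING from ESPnet language
# codes to Whisper language IDs (not the actual table).
LANGUAGES_CODE_MAPPING = {
    "zh": "chinese",
    "jp": "japanese",
    "en": "english",
}


def map_language(lang: str) -> str:
    """Map an ESPnet language code to a Whisper language ID,
    rejecting codes the Whisper models do not support."""
    whisper_lang = LANGUAGES_CODE_MAPPING.get(lang)
    if whisper_lang is None:
        raise ValueError(f"language: {lang} unsupported for Whisper model")
    return whisper_lang


print(map_language("zh"))  # → chinese
```

Unsupported codes fail fast with a ValueError instead of silently falling back to a wrong language token.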

```diff
 else:
-    converter = OpenAIWhisperTokenIDConverter(model_type=bpemodel)
+    converter = OpenAIWhisperTokenIDConverter(
+        model_type=bpemodel, language=tokenizer_language
```
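The intent of this diff can be sketched with a stub; `ConverterStub` and `build_converter` below are hypothetical stand-ins for the real OpenAIWhisperTokenIDConverter, which requires the whisper package:

```python
class ConverterStub:
    """Stand-in for OpenAIWhisperTokenIDConverter."""

    def __init__(self, model_type, language=None):
        self.model_type = model_type
        self.language = language


def build_converter(bpemodel, tokenizer_language=None):
    # Before the change, the language argument was never passed, so the
    # converter always used its default language token; after the change,
    # tokenizer_language selects the Whisper language token explicitly.
    if tokenizer_language is None:
        return ConverterStub(model_type=bpemodel)
    return ConverterStub(model_type=bpemodel, language=tokenizer_language)


conv = build_converter("whisper_multilingual", tokenizer_language="zh")
print(conv.language)  # → zh
```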
Collaborator

Can we specify any language ID here, or only the languages supported by the Whisper model?

@mergify
Contributor

mergify bot commented Jul 24, 2023

This pull request is now in conflict :(

@mergify mergify bot added the conflicts label Jul 24, 2023
@pengchengguo
Collaborator Author

pengchengguo commented Jul 24, 2023

I have made several updates as discussed:

  1. Included the decode config, as @simpleoier mentioned (see asr1/README.md).
  2. Updated the HF model link and noted that the model size is very large (see asr1/README.md).
  3. Added a starter recipe for Whisper fine-tuning (see asr1/run_whisper_finetune.sh).
  4. Added arbitrary-language evaluation with the original Whisper models. The language check is excluded in this part because it is already done in the Whisper code (see pyscripts/utils/evaluate_whisper_inference.py and scripts/evaluate_asr.sh).
  5. Added a LANGUAGES_CODE_MAPPING to map the language codes of ESPnet to the language IDs of Whisper. For the languages included in ESPnet, I tried to find as many mappings as possible, and we can maintain the mapping dictionary in the future (see espnet2/text/whisper_tokenizer.py).
  6. Checked whether the Whisper model supports the input language code and, if not, raised a ValueError (see espnet2/bin/whisper_export_vocabulary.py, espnet2/text/whisper_token_id_converter.py, and espnet2/text/whisper_tokenizer.py).
  7. Fixed CI test errors (see espnet2/bin/asr_inference.py).

> Thanks! I only have one concern about unseen lang_id used in whisper.

Currently, for unseen language IDs that the Whisper models do not support, we will raise a ValueError.

@mergify mergify bot removed the conflicts label Jul 24, 2023
@codecov

codecov bot commented Jul 24, 2023

Codecov Report

Merging #5344 (75e844d) into master (4847b5f) will decrease coverage by 6.33%.
The diff coverage is 82.60%.

```diff
@@            Coverage Diff             @@
##           master    #5344      +/-   ##
==========================================
- Coverage   76.11%   69.79%   -6.33%
==========================================
  Files         672      671       -1
  Lines       59864    59793      -71
==========================================
- Hits        45567    41733    -3834
- Misses      14297    18060    +3763
```

| Flag | Coverage Δ |
| --- | --- |
| test_configuration_espnet2 | ∅ <ø> (∅) |
| test_integration_espnet1 | 65.93% <ø> (ø) |
| test_integration_espnet2 | 47.92% <31.25%> (-0.01%) ⬇️ |
| test_python_espnet1 | ? |
| test_python_espnet2 | 51.36% <82.60%> (+<0.01%) ⬆️ |
| test_utils | 23.17% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| espnet2/text/build_tokenizer.py | 78.37% <0.00%> (ø) |
| espnet2/train/preprocessor.py | 44.83% <ø> (ø) |
| espnet2/bin/asr_inference.py | 87.46% <40.00%> (-0.45%) ⬇️ |
| espnet2/bin/whisper_export_vocabulary.py | 91.83% <100.00%> (+0.92%) ⬆️ |
| espnet2/text/whisper_token_id_converter.py | 85.18% <100.00%> (+2.57%) ⬆️ |
| espnet2/text/whisper_tokenizer.py | 85.71% <100.00%> (+2.38%) ⬆️ |

... and 113 files with indirect coverage changes


```diff
 ${python} -m espnet2.bin.whisper_export_vocabulary \
     --whisper_model "${token_type}" \
-    --output "${token_list}"
+    --output "${token_list}" ${_opts}
```
Collaborator

I think it would be useful if the script could exit when ${lang} is not recognized. Is this already satisfied? Or could it be done just by adding || exit 1 after this line? I'm not sure whether the Python script satisfies the condition of returning a non-zero status.

Collaborator Author

The shell script directly terminates when whisper_export_vocabulary raises "ValueError: language unsupported for Whisper model", so adding || exit 1 or not does not affect the shell script's behavior.
[image: screenshot of the shell output]
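For reference, a Python process that dies with an uncaught exception always exits with a non-zero status, which is what lets the shell stop here. A minimal demonstration, independent of ESPnet:

```python
import subprocess
import sys

# An uncaught exception makes CPython exit with status 1, so a shell
# running under `set -e` (or a command followed by `|| exit 1`) stops
# at that point.
proc = subprocess.run(
    [sys.executable, "-c",
     "raise ValueError('language unsupported for Whisper model')"],
    capture_output=True,
    text=True,
)
print(proc.returncode)               # → 1
print("ValueError" in proc.stderr)   # → True
```

So the explicit `|| exit 1` is redundant as long as the ValueError is not caught inside the Python script.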

@kan-bayashi kan-bayashi modified the milestones: v.202307, v.202312 Aug 3, 2023
Collaborator

@simpleoier simpleoier left a comment

LGTM! Thanks!

@sw005320 sw005320 merged commit 093a315 into espnet:master Aug 3, 2023
@sw005320
Contributor

sw005320 commented Aug 3, 2023

Thanks, @pengchengguo!

4 participants