Add espnetez #5372

Merged
sw005320 merged 34 commits into espnet:master from Masao-Someki:feature/espnetez
Dec 9, 2023
Conversation

@Masao-Someki
Contributor

What?

This PR adds the espnetez package to make ESPnet easier to use!

Why?

ESPnet is driven primarily by shell scripts, which can make it difficult for beginners to run all the stages. The espnetez tool provides a Pythonic frontend, making ESPnet more user-friendly.

Masao-Someki and others added 2 commits July 26, 2023 00:51
- ESPnet becomes super simple!
@sw005320 added this to the v.202307 milestone Jul 25, 2023
@Masao-Someki changed the title from "Add espnetez" to "[WIP] Add espnetez" Jul 25, 2023
@Masao-Someki
Contributor Author

This is a sample training script for the ASR task with espnetez.
All you need to do is build the dump files and pass them to the trainer.
You can combine several datasets simply by concatenating their dump files.

import espnetez as ez

# dataset information
# the format of wav.scp is: <audio_tag><space><file_path>\n
# and the format of text is: <audio_tag><space><text>\n
# Example for wav.scp: audio_1 /database/libri100/train/first.flac
# Example for text: audio_1 HELLO WORLD
data_inputs = {
    "speech": {"file": "wav.scp", "type": "kaldi_ark"},
    "text": {"file": "text", "type": "text"},
}
train_dump_path = "dump/raw/train_clean_100_sp"
test_dump_path = "dump/raw/test_clean"
output_path = "exp"

# You can use configuration from the ESPnet recipes.
training_config = ez.config.from_yaml("asr", "train_asr_branchformer_e24_amp.yaml")

# and you can update with your config.
preprocessor_config = ez.utils.load_yaml("preprocess.yaml")
training_config.update(preprocessor_config)

# Define trainer
trainer = ez.trainer.Trainer(
    "asr", train_dump_path, test_dump_path, output_path,
    data_inputs, training_config,
    ngpu=1,  # you can also update configuration here
)

# If you don't have stats file then you need to run this.
trainer.collect_stats()

# finally run train()
trainer.train()
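
The dump-file format described in the script's comments can be produced with plain Python. The following is a minimal sketch and not part of espnetez: `write_dump` and `merge_dumps` are hypothetical helper names, and it assumes a dump directory only needs the `wav.scp` and `text` files shown above.

```python
from pathlib import Path


def write_dump(dump_dir, utterances):
    """Write wav.scp and text for (audio_tag, file_path, transcript) tuples."""
    dump_dir = Path(dump_dir)
    dump_dir.mkdir(parents=True, exist_ok=True)
    with open(dump_dir / "wav.scp", "w", encoding="utf-8") as scp, \
            open(dump_dir / "text", "w", encoding="utf-8") as txt:
        for tag, path, transcript in utterances:
            scp.write(f"{tag} {path}\n")       # <audio_tag><space><file_path>
            txt.write(f"{tag} {transcript}\n")  # <audio_tag><space><text>


def merge_dumps(dump_dirs, merged_dir, names=("wav.scp", "text")):
    """Join several dump directories by concatenating their files."""
    merged_dir = Path(merged_dir)
    merged_dir.mkdir(parents=True, exist_ok=True)
    for name in names:
        seen = set()
        with open(merged_dir / name, "w", encoding="utf-8") as out:
            for d in dump_dirs:
                lines = Path(d, name).read_text(encoding="utf-8").splitlines()
                for line in lines:
                    if not line.strip():
                        continue  # skip blank lines
                    tag = line.split(maxsplit=1)[0]
                    if tag in seen:
                        # Tags must stay unique across the joined datasets.
                        raise ValueError(f"duplicate audio tag {tag!r} in {name}")
                    seen.add(tag)
                    out.write(line + "\n")
```

The merged directory can then be passed to the trainer in place of `train_dump_path`. The duplicate check is a design choice of this sketch: two datasets that reuse the same audio tag would otherwise silently collide.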

@Masao-Someki
Contributor Author

Masao-Someki commented Jul 25, 2023

TODO

  • Add more tasks. (Currently, I added asr/transducer/tts for debugging.)
  • Add sentencepiece training
  • Add frontend for creating dump files
  • Refactor the trainer class, especially the _update_config() function.

@sw005320 added the "TTS Text-to-speech" and "ASR Automatic speech recogntion" labels Jul 25, 2023
@codecov

codecov bot commented Jul 25, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (35b8f01) 76.54% compared to head (c788444) 76.54%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5372   +/-   ##
=======================================
  Coverage   76.54%   76.54%           
=======================================
  Files         720      720           
  Lines       66602    66602           
=======================================
  Hits        50978    50978           
  Misses      15624    15624           
Flag                         Coverage Δ
test_configuration_espnet2   ∅ <ø> (∅)
test_integration_espnet1     62.92% <ø> (ø)
test_integration_espnet2     50.10% <ø> (ø)
test_python_espnet1          19.08% <ø> (ø)
test_python_espnet2          52.38% <ø> (ø)
test_utils                   22.15% <ø> (ø)

@sw005320 requested a review from pyf98 July 25, 2023 19:32
@sw005320
Contributor

@pyf98, can you help this project by reviewing and testing PRs?

@kan-bayashi modified the milestones: v.202307, v.202312 Aug 3, 2023
@sw005320
Contributor

sw005320 commented Aug 3, 2023

@Masao-Someki, can you let me know the progress?
Can you fix the CI and finish refactoring?
I think we can make an LM training step a lower priority.

@Masao-Someki
Contributor Author

@sw005320
I'm using espnetez to train a single ASR model, but I'm encountering some issues during training.
For example, the uid variable in the dataloader/dataset becomes a float, which triggers an assertion error.
I'm not sure why this happens, so I'm checking whether the dump files are generated properly.
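
A quick consistency check can help rule out a malformed dump file. The sketch below is an illustration only and does not diagnose the float uid issue itself: `read_tags` and `check_dump` are hypothetical helpers, and the check assumes the `<audio_tag><space><value>` line format from the sample script.

```python
from pathlib import Path


def read_tags(path):
    """Return the audio tags of a '<audio_tag> <value>' file, in order."""
    tags = []
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, 1):
        fields = line.split(maxsplit=1)
        if len(fields) != 2:
            # Blank lines and lines without a value are both rejected.
            raise ValueError(f"{path}:{lineno}: expected '<audio_tag> <value>'")
        tags.append(fields[0])
    return tags


def check_dump(dump_dir):
    """Return a list of human-readable problems found in a dump directory."""
    wav_tags = read_tags(Path(dump_dir, "wav.scp"))
    text_tags = read_tags(Path(dump_dir, "text"))
    problems = []
    if len(set(wav_tags)) != len(wav_tags):
        problems.append("duplicate tags in wav.scp")
    unmatched = sorted(set(wav_tags) ^ set(text_tags))
    if unmatched:
        problems.append(f"tags not present in both files: {unmatched}")
    return problems
```

An empty list from `check_dump("dump/raw/train_clean_100_sp")` would mean both files parse and list the same tags; it does not guarantee that the audio files themselves are readable.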

@Masao-Someki
Contributor Author

I added a demo notebook that trains an E-Branchformer model on the Librispeech-100 dataset. (link)
In my environment, the final train() call does not execute successfully in the Jupyter notebook. We may need to run train() from the command line.
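
One possible workaround, sketched here under assumptions rather than taken from this PR, is to save the training code as a standalone script and launch it in a fresh interpreter. `run_training_script` is a hypothetical helper, and `train.py` in the usage note is an assumed file name.

```python
import subprocess
import sys


def run_training_script(script_path, *args):
    """Run a Python script in a fresh interpreter and return its exit code."""
    # sys.executable keeps the same Python environment as the caller,
    # so the script sees the same installed packages.
    completed = subprocess.run([sys.executable, str(script_path), *args])
    return completed.returncode
```

From a notebook cell, `run_training_script("train.py")` would then run the script outside the notebook kernel; a non-zero return value indicates that the script failed.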

@kan-bayashi added this to the v.202312 milestone Oct 25, 2023
Masao-Someki and others added 4 commits November 9, 2023 02:19
- Easy task class will be used as the wrapper of AbsTask.
- It is to enable finetuning the pretrained model defined by user.
@Masao-Someki
Contributor Author

Masao-Someki commented Nov 9, 2023

I included a demo notebook on fine-tuning the pre-trained model using LoRA.
ESPnet-Easy simplifies fine-tuning the pretrained model from the Hugging Face hub with a custom dataset.
(Currently, it seems that there is a bug in the training process with the pretrained model.)

@sw005320
Contributor

sw005320 commented Nov 9, 2023

Very cool!

@juice500ml, @ftshijt, @simpleoier, can you check this?

@pyf98
Collaborator

pyf98 commented Nov 9, 2023

Can we provide an example for fine-tuning OWSM (e.g., https://huggingface.co/espnet/owsm_v2_ebranchformer)? It will attract more users.

@juice500ml
Contributor

This PR is awesome!! Can we also consider packaging it for PyPI, so that people can simply run pip install espnetez and use it directly?

@mergify bot added the ESPnet2 label Nov 11, 2023
@Masao-Someki
Contributor Author

Thank you @sw005320, @pyf98, and @juice500ml,

I apologize for the delay in development, but the bug in the fine-tuning process has been successfully fixed.
This PR is now ready for review!

Since I currently have only one GPU on my local machine, I kindly request the reviewer's assistance in checking whether the training process runs successfully with multiple GPUs.

@Masao-Someki changed the title from "[WIP] Add espnetez" to "Add espnetez" Nov 11, 2023
@classmethod
def main(cls, args: argparse.Namespace = None, cmd: Sequence[str] = None):
    assert check_argument_types()
    print(get_commandline_args(), file=sys.stderr)
Collaborator

Why was this removed?

Contributor Author

@pyf98
I'm sorry, I accidentally deleted this line...
I just reverted this modification.

@pyf98
Collaborator

pyf98 commented Dec 2, 2023

LGTM!

@Masao-Someki
Contributor Author

Masao-Someki commented Dec 2, 2023

I will fix the CI and add an inference guide to the notebooks to finish this PR!

  • Fix CI
  • Add inference instruction in notebooks

@sw005320
Contributor

sw005320 commented Dec 9, 2023

Thanks a lot, @Masao-Someki!
This is a great first step!

@Masao-Someki deleted the feature/espnetez branch March 26, 2024 12:54
Labels

ASR Automatic speech recogntion, ESPnet2, New Features, TTS Text-to-speech

5 participants