[espnet3-6] Add evaluation scripts #6178
Pull Request Overview
This PR introduces a variety of evaluation scripts and configuration updates for the ESPnet3 evaluation and inference pipelines. Key changes include:
- New test utilities and configuration files for score evaluation and dataloader setup.
- Several integration tests and unit tests for inference, score runners, trainers, hybrid optimizers, and dataloader builders.
- Removal of legacy modules in the espnetez package to streamline the codebase.
Reviewed Changes
Copilot reviewed 72 out of 75 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test_utils/espnet3/stats/stats_dummy | Adds a dummy stats file used for testing dataloader configuration. |
| test_utils/espnet3/scores/* | Adds new score files for testing sorted, ID mismatches, and missing files. |
| test_utils/espnet3/config/*.yaml | Updates to configuration files for scoring tasks and dataloader settings. |
| test/espnet3/* | New integration and unit tests covering inference runners, trainers, model schedulers, hybrid optimizers and dataloader builders. |
| espnetez/* | Removal of outdated or redundant modules including trainer, task, preprocess/tokenizer, sentencepiece, dataset, and dataloader. |

    valid:
      iter_factory:
        batch_size: 2
        num_workers": 0

The key 'num_workers"' appears to include an extra quote. It should be corrected to 'num_workers: 0' to follow proper YAML syntax.
Suggested change:
- num_workers": 0
+ num_workers: 0
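A stray quote like this can be easy to miss, because a loader may accept the line and simply keep the quote as part of the key name. The following is a minimal sketch of a key check, not part of this PR. The function name `find_suspicious_keys`, the identifier-only key rule, and the hand-written `parsed` dictionary (standing in for a loaded config, with the nesting assumed) are all assumptions for illustration.

```python
import re

# Assumption for this sketch: every config key should be a plain identifier.
_KEY_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def find_suspicious_keys(config, path=""):
    """Return dotted paths of mapping keys that are not plain identifiers."""
    bad = []
    if isinstance(config, dict):
        for key, value in config.items():
            full = f"{path}.{key}" if path else str(key)
            if not (isinstance(key, str) and _KEY_PATTERN.match(key)):
                bad.append(full)
            # Recurse into nested mappings and lists.
            bad.extend(find_suspicious_keys(value, full))
    elif isinstance(config, list):
        for i, item in enumerate(config):
            bad.extend(find_suspicious_keys(item, f"{path}[{i}]"))
    return bad


# Hand-written stand-in for what a loader might return for the snippet above
# (hypothetical: the quote is assumed to survive as part of the key).
parsed = {"valid": {"iter_factory": {"batch_size": 2, 'num_workers"': 0}}}
print(find_suspicious_keys(parsed))
```

A check of this kind could run in a unit test over the config files, so that a mistyped key fails loudly instead of being silently ignored.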
This pull request is now in conflict :(
for more information, see https://pre-commit.ci
Please fix the CI error.
The CI error for the espnet2 package comes from the Hugging Face side.
Please check #6261. Can you import the espnet master now in another PR?
Thank you, I will merge master into the espnet3 branch!
@sw005320
When using these runners, I found an issue with parallel processing in sync mode: the rank is not passed via an environment variable.
Can you make the test time shorter?
- Cancel all futures when one future raises an error
- Close local clients after the job
- Local clusters were not properly closed and remained alive during tests.
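The behavior described in these commit notes (cancel the remaining work once one task fails, and always release the workers afterwards) can be sketched as follows. This uses the standard-library `concurrent.futures` as a stand-in for whatever parallel backend the runners actually use; the `run_all` helper is hypothetical and not taken from this PR. The `cancel_futures` argument requires Python 3.9 or later.

```python
from concurrent.futures import FIRST_EXCEPTION, ThreadPoolExecutor, wait


def run_all(fn, items, max_workers=2):
    """Run fn over items in parallel; stop early if any call fails."""
    executor = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = [executor.submit(fn, item) for item in items]
        # Returns as soon as one future raises, or when all have finished.
        done, pending = wait(futures, return_when=FIRST_EXCEPTION)
        # Cancel everything that has not started yet.
        for future in pending:
            future.cancel()
        # Re-raise the first error found among the finished futures.
        for future in done:
            exc = future.exception()
            if exc is not None:
                raise exc
        # No errors: all futures are done, so collect results in input order.
        return [future.result() for future in futures]
    finally:
        # Always release the workers, even when an error is propagated.
        executor.shutdown(wait=True, cancel_futures=True)


def double(x):
    return x * 2


print(run_all(double, [1, 2, 3]))
```

Putting the shutdown in a `finally` block is what keeps workers from staying alive after a failed test; a client object for a local cluster would presumably be closed at the same point.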
…pnet into espnet3/evaluation_stage
Thanks!
What did you change?
Added a new inference module under espnet3/inference/, including:
- abs_metrics.py: Abstract base class for evaluation metrics.
- inference_runner.py: A comprehensive, configurable inference engine supporting streaming and parallelism.
- score_runner.py: Evaluation runner for decoding outputs using custom metrics.
Introduced espnet3/task.py with helpers for task-specific model instantiation (get_espnet_model) and configuration saving.
Why did you make this change?
To modularize and generalize the inference and evaluation pipelines in ESPnet3.
Is your PR small enough?
Yes.
It has many files under test_utils/espnet3/, which are mainly used for invalid test cases.
Additional Context
Note: Since I couldn't find a way to run CI with parallel processing, this PR does not contain unit tests.
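To illustrate the idea of an abstract base class for evaluation metrics mentioned under "What did you change?", here is a minimal sketch. It is not the interface defined in abs_metrics.py: the class names `MetricBase` and `ExactMatch`, the `compute` method, and its signature are all hypothetical and chosen only for this example.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MetricBase(ABC):
    """Hypothetical base class: each metric maps refs and hyps to named scores."""

    @abstractmethod
    def compute(self, refs: List[str], hyps: List[str]) -> Dict[str, float]:
        """Return a dictionary of score names to values."""


class ExactMatch(MetricBase):
    """Toy metric: fraction of hypotheses identical to their reference."""

    def compute(self, refs: List[str], hyps: List[str]) -> Dict[str, float]:
        if len(refs) != len(hyps):
            raise ValueError("refs and hyps must have the same length")
        if not refs:
            return {"exact_match": 0.0}
        hits = sum(ref == hyp for ref, hyp in zip(refs, hyps))
        return {"exact_match": hits / len(refs)}


print(ExactMatch().compute(["a b", "c d"], ["a b", "c e"]))
```

With a shared base class, a score runner can accept any list of metric objects and call the same method on each, which is one way custom metrics could be plugged in.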