
Emrys365 · 2023-02-09T09:09:32Z

This PR updates the chunk iterator to make it more general for different tasks.

I added a new argument, excluded_key_prefixes, to ChunkIterFactory so that certain keys are skipped when checking length consistency within each sample:

            for key in sequence_keys:
-               if len(batch[key]) != len(
-                   batch[sequence_keys[0]]
-               ) and not key.startswith("enroll_ref"):
+               if key.startswith(self.excluded_key_prefixes):
+                   # ignore length inconsistency for `excluded_key_prefixes`
+                   continue
+               if len(batch[key]) != len(batch[sequence_keys[0]]):
                    raise RuntimeError(
                        f"All sequences must has same length: "
                        f"{len(batch[key])} != {len(batch[sequence_keys[0]])}"

This is useful for audio-to-audio tasks such as target speaker extraction (TSE), which require additional input features that do not necessarily have the same length as the input/target signal.
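As a rough illustration of the check described above, here is a minimal standalone sketch. The function name `check_sequence_lengths` and the example keys are hypothetical and only mirror the logic of the diff; this is not the actual `ChunkIterFactory` code.

```python
from typing import Dict, Sequence


def check_sequence_lengths(
    batch: Dict[str, Sequence],
    sequence_keys: Sequence[str],
    excluded_key_prefixes: Sequence[str] = (),
) -> None:
    """Raise if any non-excluded sequence differs in length from the first key.

    Hypothetical helper that mirrors the loop in the diff above.
    """
    # str.startswith accepts a str or a tuple of str, but not a list.
    prefixes = tuple(excluded_key_prefixes)
    reference_len = len(batch[sequence_keys[0]])
    for key in sequence_keys:
        if key.startswith(prefixes):
            # Ignore length inconsistency for excluded keys.
            continue
        if len(batch[key]) != reference_len:
            raise RuntimeError(
                "All sequences must have the same length: "
                f"{len(batch[key])} != {reference_len}"
            )


# Toy TSE-like sample: the enrollment signal is shorter than the mixture.
batch = {
    "speech_mix": [0.0] * 8,
    "speech_ref1": [0.0] * 8,
    "enroll_ref1": [0.0] * 3,
}
keys = ["speech_mix", "speech_ref1", "enroll_ref1"]

# Passes: "enroll_ref1" is skipped because of its prefix.
check_sequence_lengths(batch, keys, excluded_key_prefixes=["enroll_ref"])
```

Without the exclusion, the same call raises `RuntimeError` because `enroll_ref1` has 3 elements while `speech_mix` has 8.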

In our previous discussion in the SE meeting, we decided to make a new chunk iterator for this purpose. After some coding, however, I found that the current minimal changes in espnet2/iterators/chunk_iter_factory.py work reasonably well.

If you still feel it necessary to make a new script, I will do it.

codecov · 2023-02-09T10:13:30Z

Codecov Report

Merging #4929 (533b837) into master (1bed2f9) will increase coverage by 2.20%.
The diff coverage is 54.54%.

@@            Coverage Diff             @@
##           master    #4929      +/-   ##
==========================================
+ Coverage   74.79%   77.00%   +2.20%     
==========================================
  Files         606      606              
  Lines       53721    53761      +40     
==========================================
+ Hits        40183    41396    +1213     
+ Misses      13538    12365    -1173

Flag	Coverage Δ
test_integration_espnet1	`66.33% <ø> (ø)`
test_integration_espnet2	`47.59% <54.54%> (+0.09%)`	⬆️
test_python	`66.84% <54.54%> (+2.46%)`	⬆️
test_utils	`23.35% <ø> (+0.26%)`	⬆️

Impacted Files	Coverage Δ
espnet2/train/preprocessor.py	`29.19% <0.00%> (-0.39%)`	⬇️
espnet2/iterators/chunk_iter_factory.py	`82.17% <100.00%> (+12.17%)`	⬆️
espnet2/tasks/abs_task.py	`78.25% <100.00%> (+2.57%)`	⬆️
espnet2/train/trainer.py	`76.86% <0.00%> (-0.47%)`	⬇️
espnet2/uasr/espnet_model.py	`0.00% <0.00%> (ø)`
espnet/nets/pytorch_backend/e2e_vc_transformer.py	`86.72% <0.00%> (+0.11%)`	⬆️
espnet/nets/pytorch_backend/e2e_vc_tacotron2.py	`80.48% <0.00%> (+0.15%)`	⬆️
espnet/nets/chainer_backend/e2e_asr_transformer.py	`69.59% <0.00%> (+0.20%)`	⬆️
espnet/nets/pytorch_backend/lm/seq_rnn.py	`86.88% <0.00%> (+0.21%)`	⬆️
espnet2/bin/asr_transducer_inference.py	`94.04% <0.00%> (+0.39%)`	⬆️
... and 34 more


sw005320 · 2023-02-17T12:30:48Z

espnet2/iterators/chunk_iter_factory.py

        self.seed = seed
        self.shuffle = shuffle
+        self.excluded_key_pattern = (
+            "(" + "[0-9]*)|(".join(excluded_key_prefixes) + "[0-9]*)"


Are you assuming only the case where numbers are appended to the prefix?
If so, it should be documented in add_argument or elsewhere.
Adding a comment to the config would also be informative.

Emrys365 · 2023-02-18

Yes, only exact matches and keys consisting of a prefix followed by trailing digits are considered here. I will update the argument definition and the configs accordingly.
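To make the matching rule concrete, here is a small sketch of how such a pattern behaves. It builds the pattern exactly as in the diff excerpt above. It assumes the pattern is applied with `re.fullmatch`, which the excerpt does not show, and the helper names `build_excluded_key_pattern` and `is_excluded` are hypothetical.

```python
import re
from typing import Optional, Sequence


def build_excluded_key_pattern(prefixes: Sequence[str]) -> Optional[str]:
    """Join prefixes into one regex; each alternative allows trailing digits.

    Prefixes are inserted verbatim (no re.escape), as in the diff excerpt,
    so regex metacharacters inside a prefix would be interpreted as regex.
    """
    if not prefixes:
        return None
    return "(" + "[0-9]*)|(".join(prefixes) + "[0-9]*)"


def is_excluded(key: str, pattern: Optional[str]) -> bool:
    # Assumption: the whole key must match, hence re.fullmatch.
    return pattern is not None and re.fullmatch(pattern, key) is not None


pattern = build_excluded_key_pattern(["enroll_ref", "aux"])
print(pattern)  # (enroll_ref[0-9]*)|(aux[0-9]*)

for key in ["enroll_ref", "enroll_ref1", "enroll_ref_extra", "speech_ref1"]:
    print(key, is_excluded(key, pattern))
# enroll_ref True
# enroll_ref1 True
# enroll_ref_extra False
# speech_ref1 False
```

Under this assumption, a key such as `enroll_ref_extra` is not excluded even though it starts with the prefix, which is the behavior the documentation of the argument would need to spell out.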


Emrys365 added 2 commits February 9, 2023 16:59

Update chunk iterator for the TSE task; Update EnhPreprocessor

109e92d

Update config files in TSE recipes

4fd464a

Emrys365 added the ESPnet2 and SE (Speech enhancement) labels Feb 9, 2023

Emrys365 added 2 commits February 9, 2023 21:06

Fix the shape mismatch issue in EnhPreprocessor

352cc31

Fix subset size issue in egs2/mini_an4/tse1

f2d16aa

Emrys365 force-pushed the tse branch from f34c758 to f2d16aa on February 9, 2023 15:33

Emrys365 added 2 commits February 16, 2023 14:48

Update ChunkIterFactory

1429d3e

Update ChunkIterFactory

e41f1da

sw005320 reviewed Feb 17, 2023


Emrys365 added 2 commits February 18, 2023 19:18

Update information about the argument 'chunk_excluded_key_prefixes' in ChunkIterFactory

3962326

Show excluded_key_pattern in logs

1ee58f6

sw005320 added this to the v.202303 milestone Mar 3, 2023

sw005320 added the Refactoring label Mar 3, 2023

Merge branch 'master' into tse

533b837

sw005320 added the auto-merge (Enable auto-merge) label Mar 3, 2023

mergify bot merged commit a4aeeb2 into espnet:master Mar 3, 2023

Update the chunk iterator for the TSE task #4929

mergify[bot] merged 9 commits into espnet:master from Emrys365:tse
