Implement unified batch decode interface for OWSM-CTC #6007
sw005320 merged 5 commits into espnet:master
This PR is ready.
Why do you need this change?
The previous code used flash attention only during training, but it can also be used for inference.
Can you add a test for this batch decoding?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##           master     #6007       +/-   ##
============================================
- Coverage   47.49%    14.52%   -32.97%
============================================
  Files         529       854      +325
  Lines       47850     80268    +32418
============================================
- Hits        22727     11660    -11067
- Misses      25123     68608    +43485
Thanks!
Implement unified batch decode interface for OWSM-CTC
What?
The decode_batch method decodes a batch of audio inputs, each of which can be short-form or long-form. Each input can be given as a file path, a 1-D numpy array, or a 1-D torch tensor, which makes the interface more flexible to use.
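
To illustrate the kind of input handling this describes, here is a minimal sketch of normalizing mixed inputs before batching. It is not the code from this PR: the names `to_waveform`, `split_by_length`, `load_fn`, and `max_samples` are hypothetical, and the length threshold separating short-form from long-form audio is an assumption, since the description does not state one.

```python
from pathlib import Path
from typing import Callable, List, Sequence, Tuple

import numpy as np

try:  # torch is optional for this sketch
    import torch
except ImportError:  # pragma: no cover
    torch = None


def to_waveform(audio, load_fn: Callable[[str], np.ndarray]) -> np.ndarray:
    """Convert one input (path, 1-D numpy array, or 1-D torch tensor)
    into a 1-D float32 numpy array.

    `load_fn` is a hypothetical, caller-supplied audio reader that takes
    a path string and returns the samples.
    """
    if isinstance(audio, (str, Path)):
        wav = np.asarray(load_fn(str(audio)))
    elif torch is not None and isinstance(audio, torch.Tensor):
        wav = audio.detach().cpu().numpy()
    elif isinstance(audio, np.ndarray):
        wav = audio
    else:
        raise TypeError(f"Unsupported audio input type: {type(audio)!r}")

    if wav.ndim != 1:
        raise ValueError(f"Expected a 1-D waveform, got shape {wav.shape}")
    return wav.astype(np.float32, copy=False)


def split_by_length(
    waveforms: Sequence[np.ndarray], max_samples: int
) -> Tuple[List[int], List[int]]:
    """Return (short_indices, long_indices).

    `max_samples` is an assumed threshold: inputs longer than this would
    take a long-form decoding path, the rest a short-form one.
    """
    short = [i for i, w in enumerate(waveforms) if len(w) <= max_samples]
    long = [i for i, w in enumerate(waveforms) if len(w) > max_samples]
    return short, long
```

A caller could map `to_waveform` over the batch, then use the two index lists to route each item to the appropriate decoding path and restore the original order afterwards.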