Add InterCTC to E-Branchformer encoder, and the ability to save InterCTC inference output to files #5084
…ate CTC predictions to file during ASR inference
for more information, see https://pre-commit.ci
Thanks!
token_list = self.beam_search.token_list

for layer_idx, encoder_out in intermediate_outs:
    y = self.asr_model.ctc.argmax(encoder_out)[0]  # batch_size = 1
The intermediate CTC only supports greedy decoding? I think it's fine.
Yes. I think greedy decoding can probably cover most use cases
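For readers unfamiliar with the term, greedy CTC decoding can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper name `ctc_greedy_collapse`, the toy token list, and the choice of index 0 as the blank are all assumptions made for the example. It takes the per-frame argmax ids, merges consecutive repeats, and drops blanks.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_collapse(frame_ids: Sequence[int], blank_id: int = 0) -> List[int]:
    """Collapse per-frame argmax ids into a token sequence.

    Hypothetical helper for illustration: merge consecutive repeats first,
    then remove the blank symbol (assumed here to be id 0).
    """
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return [token_id for token_id in merged if token_id != blank_id]


# Toy token list; index 0 plays the role of the CTC blank (an assumption).
token_list = ["<blank>", "a", "b", "c"]

# Per-frame argmax ids, as a single-utterance (batch_size = 1) example.
frame_ids = [0, 1, 1, 0, 1, 2, 2, 0, 0, 3]

hyp = ctc_greedy_collapse(frame_ids)
text = " ".join(token_list[i] for i in hyp)
```

Because repeats are merged before blanks are removed, the blank between the two runs of `1` keeps them as two separate tokens.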
I think it is good as long as this new feature does not affect the default behavior without interCTC.
Codecov Report
@@            Coverage Diff             @@
##           master    #5084      +/-   ##
==========================================
- Coverage   75.88%   75.86%   -0.03%
==========================================
  Files         615      615
  Lines       54767    54824      +57
==========================================
+ Hits        41559    41591      +32
- Misses      13208    13233      +25
... and 8 files with indirect coverage changes
Thanks a lot!
Add InterCTC with optional self-conditioning to E-Branchformer encoder
This was originally written by @wanchichen.
The implementation is nearly the same as InterCTC in TransformerEncoder and ConformerEncoder.
Add the ability to write predictions of InterCTC layer(s) to file during inference
The output is saved at path/to/asr_decode_dir/test_set/logdir/output.*/1best_recog/encoder_interctc_layer<layer_idx>.txt.
<layer_idx> corresponds to the indices set in the model config file.
For example, if the config contains the line interctc_layer_idx: [6,12], then there will be two files, one for the output of the 6th encoder layer and one for the 12th: encoder_interctc_layer6.txt and encoder_interctc_layer12.txt.
Misc
The code in this PR was used in many of my Aphasia speech detection experiments and produced good results. I also added more tests to make sure ASR without InterCTC still works.
@sw005320 In our last meeting I mentioned a bug I had encountered in my experiments, but it looks like it has already been patched.
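As a rough illustration of the output-file naming described in this PR, the sketch below maps each configured layer index to its file name. The helper `interctc_output_paths` and the `1best_recog` directory argument are hypothetical and only mirror the naming pattern; they are not the code added in this PR.

```python
from pathlib import Path
from typing import Dict, Iterable


def interctc_output_paths(output_dir: str, layer_indices: Iterable[int]) -> Dict[int, Path]:
    """Map each InterCTC layer index to its prediction file.

    Hypothetical helper: it only reproduces the
    encoder_interctc_layer<layer_idx>.txt naming pattern.
    """
    return {
        idx: Path(output_dir) / f"encoder_interctc_layer{idx}.txt"
        for idx in layer_indices
    }


# Mirrors the example config line interctc_layer_idx: [6,12].
paths = interctc_output_paths("1best_recog", [6, 12])
for idx, path in paths.items():
    print(idx, path.name)
```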