
Miamoto · 2026-01-16T14:24:10Z

What did you change?

•	Added the implementation from "Improving Cross-Attention based on Positional Alignment during Inference for Robust Long-form Speech Recognition"
•	Introduced a new decoding option to enable the inference trick: "inference_lf_trick: True" can be added to the decode_asr.yaml file.

Why did you make this change?

Is your PR small enough?

yes

Additional Context


gemini-code-assist

Code Review

This pull request implements an inference trick from the paper "Improving Cross-Attention based on Positional Alignment during Inference for Robust Long-form Speech Recognition". The changes introduce a new decoding option and modify several components to apply a Gaussian bias to cross-attention scores during inference. My review identified a critical issue where the new feature would be silently ignored when using optimized attention mechanisms like Flash Attention. I have also pointed out a minor type hint inconsistency. Overall, the implementation of the core logic appears sound, but the interaction with existing optimizations needs to be addressed to ensure the feature works correctly in all configurations.
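To make the mechanism concrete, here is a minimal pure-Python sketch of adding a Gaussian bias to cross-attention scores before the softmax. It is not the PR's code: the function names, the `center` and `sigma` parameters, and the exact form of the bias are assumptions chosen for illustration, and the paper may define the alignment position differently.

```python
import math


def gaussian_bias(num_frames, center, sigma):
    """Log-domain Gaussian penalty over encoder frames, peaked at `center`.

    `center` and `sigma` are hypothetical parameters for this sketch.
    """
    return [-((j - center) ** 2) / (2.0 * sigma**2) for j in range(num_frames)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def biased_cross_attention(scores, center, sigma):
    """Add the bias to the raw attention scores, then normalize."""
    bias = gaussian_bias(len(scores), center, sigma)
    return softmax([s + b for s, b in zip(scores, bias)])


# Flat scores: without a bias the attention over 8 frames is uniform.
scores = [0.0] * 8
plain = softmax(scores)
biased = biased_cross_attention(scores, center=5.0, sigma=1.0)
# Index of the frame that receives the most attention after biasing.
peak = max(range(len(biased)), key=biased.__getitem__)
```

The bias only takes effect when it is added to the raw scores before normalization. A fused attention path that computes the softmax internally and is never handed the bias would return the unbiased result without raising an error, which is the failure mode flagged above. One plausible guard, stated here as an assumption rather than what the PR does, is to fall back to the explicit score path whenever the option is enabled.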

espnet2/bin/asr_inference.py


sw005320 · 2026-01-29T20:45:06Z

Can you fix the CI error?
Can you come up with a better name for inference_lf_trick?
For the CTC prefix score, a margin is already prepared in CTCPrefixScorer, but it is not configurable via the current option arguments. So, how about using this?

inference trick from: Improving Cross-Attention based on Positional A…

1e9860c

…lignment during Inference

dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files), ASR (Automatic speech recognition), and ESPnet2 labels on Jan 16, 2026

[pre-commit.ci] auto fixes from pre-commit.com hooks

7cd4403

for more information, see https://pre-commit.ci

gemini-code-assist bot reviewed Jan 16, 2026

espnet2/bin/asr_inference.py (outdated)

Miamoto and others added 2 commits January 16, 2026 17:51

Update espnet2/bin/asr_inference.py

7731448

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Merge branch 'master' into inference_trick

22f94d8

Masao-Someki mentioned this pull request Jan 19, 2026

[espnet3-16] Add demo stage #6342

Open

Merge branch 'master' into inference_trick

a0098ab


inference trick from: "Improving Cross-Attention based on Positional Alignment during Inference for Robust Long-form Speech Recognition" #6339

Miamoto wants to merge 5 commits into espnet:master from Miamoto:inference_trick


2 participants
