whr-a · 2025-10-20T21:27:43Z

Fix 2 bugs:
- When calculating the residual in RVQ, detach() should be applied. Refer to Cisco's fix: core_vq.py, and Moshi's implementation: core_vq.py
- Bug in passing target_bandwidth during inference
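
The first fix can be illustrated with a toy example. The thread does not show ESPnet's quantizer code, so the sketch below assumes a quantizer that uses a straight-through estimator; `straight_through_quantize` and `next_residual` are hypothetical helpers, not ESPnet functions. Under that assumption, subtracting the non-detached quantized output cancels the gradient of the next residual with respect to the current one, while detaching it keeps that gradient intact.

```python
import torch


def straight_through_quantize(residual, codebook):
    """Nearest-codeword lookup with a straight-through gradient (assumed setup)."""
    # Squared Euclidean distance between every input row and every codeword: (N, K).
    distances = ((residual[:, None, :] - codebook[None, :, :]) ** 2).sum(dim=-1)
    indices = distances.argmin(dim=-1)
    codeword = codebook[indices]
    # Forward value equals the codeword; the backward pass treats the
    # quantizer as the identity with respect to `residual`.
    return residual + (codeword - residual).detach()


def next_residual(residual, codebook, detach):
    """Compute the residual handed to the next quantizer layer (hypothetical helper)."""
    quantized = straight_through_quantize(residual, codebook)
    if detach:
        quantized = quantized.detach()
    return residual - quantized


torch.manual_seed(0)
codebook = torch.randn(8, 4)
x = torch.randn(3, 4, requires_grad=True)

# Gradient of the next residual with respect to the current one.
(grad_no_detach,) = torch.autograd.grad(next_residual(x, codebook, detach=False).sum(), x)
(grad_detach,) = torch.autograd.grad(next_residual(x, codebook, detach=True).sum(), x)

print(grad_no_detach)  # all zeros: the straight-through term cancels the gradient
print(grad_detach)     # all ones: the residual path keeps its gradient
```

This only shows the gradient behaviour of a single residual step in a toy setting; it is not a reproduction of the ESPnet, Cisco, or Moshi implementations.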

gemini-code-assist

Code Review

This pull request introduces two important bug fixes for the codec codebase. The first fix correctly applies .detach() when calculating the residual in the Residual Vector Quantizer, which is crucial for stable training. The second fix enables passing target_bandwidth during inference.

The changes for both fixes are well-implemented. However, I've identified a potential issue in the inference methods of both DAC and SoundStream models. Passing **kwargs directly to the generator's encode method could cause a TypeError if unexpected arguments are provided. I've suggested a small refactoring to call the model's own encode method instead, which handles keyword arguments safely and improves code reuse. Overall, this is a good set of fixes.

gemini-code-assist · 2025-10-20T21:28:59Z

espnet2/gan_codec/dac/dac.py


        """
-        codec = self.generator.encode(x)
+        codec = self.generator.encode(x, **kwargs)


Calling self.generator.encode(x, **kwargs) directly is risky as it will raise a TypeError if kwargs contains any key other than target_bw. The inference method's signature allows for any keyword arguments, but DACGenerator.encode is more restrictive. It's safer to call self.encode(x, **kwargs), which correctly filters the keyword arguments, ensuring only target_bw is passed along. This also improves code reuse.

Suggested change

-        codec = self.generator.encode(x, **kwargs)
+        codec = self.encode(x, **kwargs)
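
The failure mode described in this comment can be reproduced with a small stand-alone sketch. `ToyGenerator` and `ToyCodec` are hypothetical stand-ins, and the body of `ToyCodec.encode` is an assumption based on the reviewer's description (it forwards only `target_bw`); the actual ESPnet methods are not shown in this thread.

```python
class ToyGenerator:
    """Hypothetical generator whose encode() accepts only `target_bw`."""

    def encode(self, x, target_bw=None):
        return {"input": x, "target_bw": target_bw}


class ToyCodec:
    """Hypothetical model wrapper, loosely mirroring the structure in the review."""

    def __init__(self):
        self.generator = ToyGenerator()

    def encode(self, x, **kwargs):
        # Forward only the keyword the generator understands; ignore the rest.
        if "target_bw" in kwargs:
            return self.generator.encode(x, target_bw=kwargs["target_bw"])
        return self.generator.encode(x)

    def inference_direct(self, x, **kwargs):
        # Passes every keyword through, so unknown keys reach the generator.
        return self.generator.encode(x, **kwargs)

    def inference_filtered(self, x, **kwargs):
        # Delegates to encode(), which drops unknown keys.
        return self.encode(x, **kwargs)


codec = ToyCodec()

# The filtered path tolerates an extra keyword and still forwards target_bw.
print(codec.inference_filtered([0.0], target_bw=6.0, unrelated=True))

# The direct path raises TypeError on the same extra keyword.
try:
    codec.inference_direct([0.0], target_bw=6.0, unrelated=True)
except TypeError as err:
    print("TypeError:", err)
```

Whether silently dropping unknown keywords is preferable to failing loudly is a design choice; the sketch only shows the difference in behaviour between the two call paths.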

gemini-code-assist · 2025-10-20T21:28:59Z

espnet2/gan_codec/soundstream/soundstream.py


        """
-        codec = self.generator.encode(x)
+        codec = self.generator.encode(x, **kwargs)


Calling self.generator.encode(x, **kwargs) directly can lead to a TypeError if kwargs includes keys not expected by SoundStreamGenerator.encode (which only accepts target_bw). To prevent potential crashes and improve code reuse, it's better to call self.encode(x, **kwargs). The self.encode method is designed to safely handle arbitrary keyword arguments by extracting only the relevant ones.

Suggested change

-        codec = self.generator.encode(x, **kwargs)
+        codec = self.encode(x, **kwargs)

codecov · 2025-10-20T21:45:52Z

Codecov Report

❌ Patch coverage is 50.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.77%. Comparing base (333b6f7) to head (c5e3bdf).
⚠️ Report is 722 commits behind head on master.

Files with missing lines	Patch %	Lines
espnet2/gan_codec/dac/dac.py	0.00%	3 Missing ⚠️
espnet2/gan_codec/soundstream/soundstream.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           master    #6268       +/-   ##
===========================================
+ Coverage   46.53%   56.77%   +10.24%     
===========================================
  Files         542      889      +347     
  Lines       49601    84363    +34762     
===========================================
+ Hits        23080    47899    +24819     
- Misses      26521    36464     +9943

Flag	Coverage Δ
test_integration_espnet2	`46.80% <37.50%> (+0.26%)`	⬆️
test_integration_espnetez	`36.92% <ø> (?)`
test_python_espnet2	`51.20% <25.00%> (?)`
test_python_espnetez	`12.81% <0.00%> (?)`
test_utils	`18.77% <ø> (?)`


sw005320 · 2025-10-20T22:07:14Z

Oh, this sounds critical.
Thanks for catching it.
@ftshijt, can you review this PR?

ftshijt · 2025-10-22T07:56:19Z

Thanks for the fix! The changes look great to me.

fix: quantizer detach and inference bandwidth

4df70cf

dosubot bot added the size:S (This PR changes 10-29 lines, ignoring generated files.) and Bug (bug should be fixed) labels Oct 20, 2025

mergify bot added the ESPnet2 label Oct 20, 2025

gemini-code-assist bot reviewed Oct 20, 2025

[pre-commit.ci] auto fixes from pre-commit.com hooks

c5e3bdf

for more information, see https://pre-commit.ci

sw005320 requested a review from ftshijt October 21, 2025 11:16

sw005320 added this to the v.202512 milestone Oct 21, 2025

ftshijt merged commit 7bbb72f into espnet:master Oct 22, 2025
32 checks passed

Fhrozen modified the milestones: v.202512, v.202511 Nov 14, 2025

Codec codebase bug fixes: `detach()` in RVQ residual and `target_bandwidth` in inference #6268

ftshijt merged 2 commits into espnet:master from whr-a:pr_codec
