Add Callbacks #6249
Conversation
Details on the test cases are listed here: #6180 (comment)
Code Review
This pull request introduces new callbacks for PyTorch Lightning, including a callback to average the best K checkpoints. The implementation is mostly solid, and it comes with a comprehensive test suite.
I've found a few issues that should be addressed:
- In `espnet3/trainer/callbacks.py`, there's a potential `TypeError` if no checkpoints are available for averaging. Also, the data type check for averaging is a bit fragile and could be made more robust.
- In `espnet3/trainer/trainer.py`, a change from `_del_config_key` to `pop` could cause a regression when using `argparse.Namespace` for configuration.
Additionally, it seems that the new callbacks are not actually passed to the `lightning.Trainer` instance in `espnet3/trainer/trainer.py`, as the `callbacks` argument is still commented out. This would prevent the new functionality from working.
Code Review
This pull request introduces new callback functionality for checkpoint averaging and standardizes callback creation. My review focuses on improving security, robustness, and correctness. I've identified a critical security vulnerability in how checkpoints are loaded, a potential crash when no checkpoints are available for averaging, and a bug in configuration handling that could affect different configuration object types. I've provided suggestions to fix these issues and also recommended adding a test case for an important edge case.
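For the checkpoint-loading vulnerability mentioned above, the usual mitigation in recent PyTorch versions is to pass `weights_only=True` to `torch.load`. A minimal, self-contained sketch (the checkpoint path and contents are purely illustrative):

```python
import os
import tempfile

import torch

# Create a small checkpoint purely for demonstration.
ckpt_path = os.path.join(tempfile.gettempdir(), "demo.ckpt")
torch.save({"w": torch.zeros(2)}, ckpt_path)

# weights_only=True restricts unpickling to tensors and plain containers,
# so a maliciously crafted checkpoint cannot execute arbitrary code on load.
state = torch.load(ckpt_path, map_location="cpu", weights_only=True)
```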
I think it is a great idea to implement checkpoint averaging (or averaging of other monitored values) via a callback. @Emrys365, can you check this PR?
```python
for callback in self.config.callbacks:
    callbacks.append(instantiate(callback))
```
Can it happen that the callbacks defined in `self.config` are duplicates of the default callbacks? In that case, would it be better to detect such cases and display warnings?
Thank you. The current implementation does not account for cases where callbacks defined in `get_default_callbacks` overlap with those specified in the config. For example, `LearningRateMonitor` may be registered twice, which can result in duplicated log outputs.
In Lightning's current behavior this is not a critical issue, so it does not immediately cause an error.
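A hedged sketch of how such overlaps could be detected and warned about when merging the defaults with config-specified callbacks. The `merge_callbacks` helper and the stub callback classes are illustrative, not the actual espnet3 API (in real code the classes would come from `lightning.pytorch.callbacks`):

```python
import warnings


class LearningRateMonitor:  # stand-in for the Lightning callback class
    pass


class ProgressBar:  # stand-in for another default callback
    pass


def merge_callbacks(defaults, extras):
    """Append config callbacks, warning on (and skipping) duplicate types."""
    merged = list(defaults)
    seen = {type(cb) for cb in defaults}
    for cb in extras:
        if type(cb) in seen:
            warnings.warn(
                f"{type(cb).__name__} is already registered by default; "
                "skipping the duplicate."
            )
            continue
        merged.append(cb)
        seen.add(type(cb))
    return merged
```

Deduplicating by callback type is a simplification; callbacks that are legitimately registered twice with different arguments would need a finer-grained check.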
```python
class AverageCheckpointsCallback(Callback):
    """
    A custom PyTorch Lightning callback that performs weight averaging over the top-K
    checkpoints (according to specified metrics) at the end of training.
    """
```
Does model averaging only happen at the end of training?
This would be an issue when people try to get the intermediate results with the checkpoint.
Thank you! Right now it's only hooked into `on_fit_end`, so it fires once at the very end. I will switch to `on_validation_end` so it triggers after every validation round.
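A minimal sketch of what that change could look like. The `Callback` stub and the `_average_and_save` helper are hypothetical stand-ins (in real code `Callback` would come from `lightning.pytorch.callbacks`, and the helper would perform the actual top-K averaging):

```python
class Callback:  # stub standing in for lightning.pytorch.callbacks.Callback
    pass


class AverageCheckpointsCallback(Callback):
    def __init__(self):
        self.n_triggers = 0

    def on_validation_end(self, trainer, pl_module):
        # Fires after every validation round, so averaged checkpoints are
        # available mid-training, not only once at the very end of fit.
        if getattr(trainer, "sanity_checking", False):
            return  # skip the pre-training sanity-check validation pass
        self._average_and_save(trainer)

    def _average_and_save(self, trainer):
        # Hypothetical helper: would average the top-K checkpoints and
        # save the result. Here it only records that it was triggered.
        self.n_triggers += 1
```

Guarding on `trainer.sanity_checking` matters because Lightning runs a sanity-check validation pass before training, when no checkpoints exist yet.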
Codecov Report ❌ Patch coverage is
Additional details and impacted files

```diff
@@ Coverage Diff @@
##           espnet3    #6249      +/- ##
===========================================
+ Coverage    68.96%   68.98%   +0.01%
===========================================
  Files          750      751       +1
  Lines        68915    68974      +59
===========================================
+ Hits         47530    47584      +54
- Misses       21385    21390       +5
```
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
What did you change?
Added a new module, `espnet3/trainer/callbacks.py`, which includes:
- `AverageCheckpointsCallback`: a custom Lightning callback for averaging top-K model checkpoints.
- `get_default_callbacks()`: a utility to create a standard set of callbacks, including checkpointing, a progress bar, and LR monitoring.

Integrated `get_default_callbacks` into the training loop in `espnet3/trainer/trainer.py`.
Created a new test module, `test/espnet3/test_callback.py`.

Why did you make this change?
Is your PR small enough?
Yes. 3 files changed, ~500 additions.
Additional Context