Fix: Respect inference_mode when setting adapters with modules_to_save (Issue #2928)
#2931
base: main
Conversation
… in ModulesToSaveWrapper

Remove the lines that set requires_grad on original_module in the ModulesToSaveWrapper.enable_adapters() method. This change addresses the maintainer's feedback that there is no reason to touch the requires_grad of the original_module here, and that it conflicts with bitsandbytes quantization, which requires gradients to be False at all times. The original_module's requires_grad is no longer manipulated by enable_adapters(); only the modules_to_save gradients are managed.

Updated test_requires_grad_modules_to_save_disabling to reflect this change by removing expectations about original_module having gradients when adapters are disabled.

Related to issue huggingface#2928 and PR huggingface#2931.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.

Still relevant, please don't mark as stale/close.
BenjaminBossan
left a comment
Thanks for the PR, I finally got around to revisiting this topic. Please check the comments I added. Moreover, could you please merge with/rebase on the latest main branch?
src/peft/utils/other.py
Outdated
# if the adapter is found in this module, set it as the active adapter, else disable the adapters of this
# module
if adapter_name_to_set in module._adapters:
    if not inference_mode:
Using enable_adapters here is not the right way. We already pass inference_mode to the module.set_adapter call; it should be implemented there.
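A hypothetical sketch of that suggestion (a simplified stand-in, not the actual PEFT source; the class and attribute names below are illustrative): the inference_mode handling lives inside the wrapper's set_adapter rather than in the _set_adapter helper.

import copy
import torch.nn as nn

class ModulesToSaveWrapperSketch(nn.Module):
    """Hypothetical, simplified stand-in for ModulesToSaveWrapper."""

    def __init__(self, original_module: nn.Module, adapter_name: str):
        super().__init__()
        self.original_module = original_module
        self.modules_to_save = nn.ModuleDict({adapter_name: copy.deepcopy(original_module)})
        self._active_adapter = [adapter_name]

    def update(self, adapter_name: str) -> None:
        # register a fresh trainable copy for a new adapter
        self.modules_to_save[adapter_name] = copy.deepcopy(self.original_module)

    def set_adapter(self, adapter_name: str, inference_mode: bool = False) -> None:
        # freeze every saved copy, then unfreeze only the newly active adapter,
        # unless inference_mode is requested, in which case it stays frozen too
        for module in self.modules_to_save.values():
            module.requires_grad_(False)
        self.modules_to_save[adapter_name].requires_grad_(not inference_mode)
        self._active_adapter = [adapter_name]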
tests/test_other.py
Outdated
assert expected == modules


class TestModulesToSaveInferenceMode:
Instead of adding this new test class, let's add the new tests to the existing tests for requires_grad, which you already modified above. Just for reference, I mean this test class:
peft/tests/test_custom_models.py
Line 3892 in 261366d
class TestRequiresGrad:
If I'm not mistaken, the following tests should cover what we need:
def test_requires_grad_follows_inference_mode_modules_to_save(self):
    # check that passing inference_mode to set_adapter has the intended effect with LoRA and modules_to_save
    config0 = LoraConfig(target_modules=["lin0"], modules_to_save=["lin1"])
    peft_model = get_peft_model(MLP(), config0)

    config1 = LoraConfig(target_modules=["lin0"], modules_to_save=["lin1"])
    peft_model.add_adapter("adapter1", config1)

    # active adapter is still "default"
    self.check_requires_grad(
        peft_model,
        "base_model.model.lin1.modules_to_save.default.weight",
        "base_model.model.lin1.modules_to_save.default.bias",
        "base_model.model.lin0.lora_A.default.weight",
        "base_model.model.lin0.lora_B.default.weight",
    )

    # inference mode false (default)
    # set config0 as active, should not change anything
    peft_model.set_adapter("default", inference_mode=False)
    self.check_requires_grad(
        peft_model,
        "base_model.model.lin1.modules_to_save.default.weight",
        "base_model.model.lin1.modules_to_save.default.bias",
        "base_model.model.lin0.lora_A.default.weight",
        "base_model.model.lin0.lora_B.default.weight",
    )

    # set config1 as active, should lead to adapter1 requiring grad
    peft_model.set_adapter("adapter1", inference_mode=False)
    self.check_requires_grad(
        peft_model,
        "base_model.model.lin1.modules_to_save.adapter1.weight",
        "base_model.model.lin1.modules_to_save.adapter1.bias",
        "base_model.model.lin0.lora_A.adapter1.weight",
        "base_model.model.lin0.lora_B.adapter1.weight",
    )

    # inference mode true
    # set config0 as active but in inference mode, should result in no module requiring grad
    peft_model.set_adapter("default", inference_mode=True)
    self.check_requires_grad(peft_model)

    # set config1 as active but in inference mode, should result in no module requiring grad
    peft_model.set_adapter("adapter1", inference_mode=True)
    self.check_requires_grad(peft_model)

def test_requires_grad_follows_inference_mode_trainable_token_indices(self):
    # check that passing inference_mode to set_adapter has the intended effect with LoRA and trainable tokens
    config0 = LoraConfig(target_modules=["conv1d"], trainable_token_indices={"emb": [0, 1, 2]})
    peft_model = get_peft_model(ModelEmbConv1D(), config0)

    config1 = LoraConfig(target_modules=["lin0"], trainable_token_indices={"emb": [0, 1, 2]})
    peft_model.add_adapter("adapter1", config1)

    # active adapter is still "default"
    self.check_requires_grad(
        peft_model,
        "base_model.model.emb.token_adapter.trainable_tokens_delta.default",
        "base_model.model.conv1d.lora_A.default.weight",
        "base_model.model.conv1d.lora_B.default.weight",
    )

    # inference mode false (default)
    # set config0 as active, should not change anything
    peft_model.set_adapter("default", inference_mode=False)
    self.check_requires_grad(
        peft_model,
        "base_model.model.emb.token_adapter.trainable_tokens_delta.default",
        "base_model.model.conv1d.lora_A.default.weight",
        "base_model.model.conv1d.lora_B.default.weight",
    )

    # set config1 as active, should lead to adapter1 requiring grad
    peft_model.set_adapter("adapter1", inference_mode=False)
    self.check_requires_grad(
        peft_model,
        "base_model.model.emb.token_adapter.trainable_tokens_delta.adapter1",
        "base_model.model.lin0.lora_A.adapter1.weight",
        "base_model.model.lin0.lora_B.adapter1.weight",
    )

    # inference mode true
    # set config0 as active but in inference mode, should result in no module requiring grad
    peft_model.set_adapter("default", inference_mode=True)
    self.check_requires_grad(peft_model)

    # set config1 as active but in inference mode, should result in no module requiring grad
    peft_model.set_adapter("adapter1", inference_mode=True)
    self.check_requires_grad(peft_model)

Please double check if this makes sense to you.
tests/test_custom_models.py
Outdated
peft_model = get_peft_model(MLP(), config)

# no layer should have requires_grad
# when disabling the adapter, modules_to_save should have requires_grad=False
I think the changes to this test can be reverted, right?
Force-pushed from 5fdebb3 to d49985e
Refactor _set_adapter and ModulesToSaveWrapper.set_adapter to address maintainer feedback. The enable_adapters calls should not be conditional in _set_adapter; instead, the inference_mode handling should be implemented entirely within the set_adapter method.

Changes:
- Remove conditional enable_adapters(True/False) calls from the _set_adapter function based on the inference_mode parameter
- Move enable_adapters logic into the ModulesToSaveWrapper.set_adapter method to handle inference_mode internally
- Call enable_adapters(not inference_mode) within set_adapter to ensure adapters are enabled/disabled correctly based on inference_mode
- Update set_adapter to handle both the empty adapter list case and the normal adapter setting case with proper enable_adapters calls

This refactoring ensures that inference_mode is handled entirely within the set_adapter method implementation, as requested by the maintainer, rather than conditionally calling enable_adapters in the _set_adapter helper function. Addresses maintainer feedback in PR huggingface#2931.
Remove the TestModulesToSaveInferenceMode test class from test_other.py as requested by the maintainer. The tests for inference_mode behaviour with modules_to_save should be integrated into the existing TestRequiresGrad class in test_custom_models.py instead of living in a separate test class.

Changes:
- Remove the entire TestModulesToSaveInferenceMode class, including:
  - test_modules_to_save_inference_mode_requires_grad_false
  - test_modules_to_save_training_mode_requires_grad_true
  - test_modules_to_save_inference_mode_with_torch_inference_mode
- Tests will be moved to the TestRequiresGrad class in test_custom_models.py following the maintainer's specified test structure

This change addresses maintainer feedback in PR huggingface#2931 to consolidate the inference_mode tests into the existing requires_grad test suite.
Add new tests for inference_mode behaviour and revert changes to test_requires_grad_modules_to_save_disabling as requested by the maintainer.

Changes:
- Add test_requires_grad_follows_inference_mode_modules_to_save to the TestRequiresGrad class to verify that passing inference_mode to set_adapter has the intended effect with LoRA and modules_to_save
- Add test_requires_grad_follows_inference_mode_trainable_token_indices to the TestRequiresGrad class to verify that passing inference_mode to set_adapter has the intended effect with LoRA and trainable tokens
- Revert test_requires_grad_modules_to_save_disabling to the original version that checks that original_module.weight and original_module.bias have requires_grad=True when adapters are disabled

The new tests follow the maintainer's specified structure and verify:
- inference_mode=False (default) maintains requires_grad=True for active adapters and modules_to_save
- inference_mode=True results in no modules requiring gradients
- Both the modules_to_save and trainable_token_indices scenarios are covered

This addresses maintainer feedback in PR huggingface#2931 to integrate the inference_mode tests into the existing TestRequiresGrad class and restore the original test expectations for modules_to_save disabling behaviour.
Add an optional inference_mode parameter to the PeftModel.set_adapter() method to allow setting adapters in a frozen state (requires_grad=False) directly, without manual parameter manipulation.

Changes:
- Add the inference_mode parameter with a default value of False to maintain backwards compatibility
- Update the method docstring to document the new parameter and clarify that adapters are set to trainable unless inference_mode is True
- Remove the manual example code snippet showing how to set requires_grad=False
- Pass the inference_mode parameter to the base_model.set_adapter() and _set_adapter() helper function calls

This enhancement simplifies the workflow for users who want to set adapters in inference mode, removing the need to manually manipulate requires_grad flags after setting an adapter.
Add a comprehensive test suite to validate that modules_to_save correctly respect the inference_mode parameter when set_adapter is called. This test class addresses issue huggingface#2928, where modules_to_save had requires_grad=True even when inference_mode=True was passed to set_adapter.

Test coverage:
- test_modules_to_save_inference_mode_requires_grad_false: verifies that modules_to_save parameters have requires_grad=False when inference_mode=True is passed to set_adapter, ensuring parameters are frozen during inference
- test_modules_to_save_training_mode_requires_grad_true: verifies that modules_to_save parameters have requires_grad=True when inference_mode=False is passed to set_adapter, ensuring parameters are trainable during training
- test_modules_to_save_inference_mode_with_torch_inference_mode: validates that modules_to_save work correctly when used with the torch.inference_mode() context manager and that forward passes still function correctly

All tests use AutoModelForSequenceClassification with a LoRA configuration targeting the query and value modules, with the classifier as modules_to_save, to provide realistic test scenarios.
Reformat the docstring comment in TestModulesToSaveInferenceMode class to fit within line length limits by combining two lines into a single line. This is a minor formatting change to improve code readability and compliance with project style guidelines.
Force-pushed from 0b46c6d to 8d61c1b
Thank you for the detailed feedback! I've addressed all your comments: Changes Made:
Please let me know if there's anything else that needs to be adjusted!
BenjaminBossan
left a comment
Thanks for updating the PR, but I think it's overcomplicating things. I flagged the parts of the code that I think need changing.
src/peft/utils/other.py
Outdated
# when calling model.add_adapter, the new adapter is not automatically active
self._active_adapter = []
# enable/disable adapters based on inference_mode
self.enable_adapters(not inference_mode)
If my understanding is correct, calling enable_adapters and requires_grad_(False) here and below should not be necessary. It should be enough to pass inference_mode to set_adapter above. Did you find a situation where making these calls was needed?
Remove redundant enable_adapters calls in ModulesToSaveWrapper.set_adapter() method. The maintainer correctly identified that passing inference_mode to set_adapter is sufficient, as the method already handles setting requires_grad correctly via self.modules_to_save[adapter_name].requires_grad_(not inference_mode). The enable_adapters calls were redundant and potentially causing issues.
Restore the test_requires_grad_modules_to_save_disabling test to check that when adapters are disabled, no parameters should have requires_grad=True, matching the intended behaviour from commit 3ea4e67. This addresses the maintainer's concern about an incorrect merge conflict resolution that reverted these test changes.
Hi @BenjaminBossan, Thanks for the review. I've addressed both points:
The code is simpler and the tests match the intended behaviour. Please let me know if anything else needs adjustment.
BenjaminBossan
left a comment
Thanks for the update. Just a small issue left.
tests/test_custom_models.py
Outdated
peft_model = get_peft_model(MLP(), config)

# no layer should have requires_grad
# when disabling the adapter, modules_to_save should have requires_grad=False
Could you please revert the change completely, including the comment and the formatting?
Revert `test_requires_grad_modules_to_save_disabling` to the version on `upstream/main`, restoring the original comments and formatting requested by the reviewer. This aligns the test semantics and style with the existing suite and avoids over-explaining implementation details in the docstrings.
Done!

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
BenjaminBossan
left a comment
Thanks for the updates.
As you can see, the CI is currently failing. The reason is the AdaptionPromptModel, which I honestly didn't think about when reviewing this PR. It also has a set_adapter method which requires adding the new argument.
As for what to do with this argument: I'd say it's not really that important. We can just say: if inference mode: raise ValueError(" ... is not supported").
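A minimal sketch of the suggested guard (hypothetical and simplified; the real AdaptionPromptModel.set_adapter contains the actual adapter-switching logic):

class AdaptionPromptModelSketch:
    """Hypothetical stand-in showing only the suggested inference_mode guard."""

    def set_adapter(self, adapter_name: str, inference_mode: bool = False) -> None:
        if inference_mode:
            raise ValueError("inference_mode=True is not supported for AdaptionPromptModel.")
        # ... the existing adapter-switching logic would continue here ...
        self._active_adapter = adapter_name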
Add inference_mode parameter to AdaptionPromptModel.set_adapter() method to match the API signature expected by PeftModel.set_adapter(). When inference_mode is True, raise a ValueError as this mode is not supported for AdaptionPromptModel. This fixes the CI failure where PeftModel.set_adapter() was calling base_model.set_adapter() with inference_mode parameter, but AdaptionPromptModel.set_adapter() did not accept this parameter. Fixes issue reported by maintainer in PR huggingface#2931 review comments.
It should be fine now!
Fix: Respect inference_mode when setting adapters with modules_to_save

Fixes #2928

Description

This PR fixes issue #2928, where modules_to_save had requires_grad=True even when inference_mode=True was passed to set_adapter(). This caused issues when using quantized models (e.g., with bitsandbytes) in inference mode, as the quantized layers require parameters to have requires_grad=False.

Problem

When calling model.set_adapter(adapter_name, inference_mode=True) with a model that has modules_to_save configured (e.g., a classification head), the modules_to_save parameters would still have requires_grad=True despite being in inference mode. This happened because:
- _set_adapter() was calling module.enable_adapters(True) unconditionally
- ModulesToSaveWrapper.enable_adapters(True) sets requires_grad_(True) for all active adapters
- this happened even when set_adapter() was called with inference_mode, creating a conflict

Solution

The fix ensures that enable_adapters() is only called when not in inference mode:

Modified _set_adapter() in src/peft/utils/other.py:
- Only call enable_adapters(True) when inference_mode=False
- Don't set requires_grad to True when inference mode should keep them False

Updated PeftModel.set_adapter() in src/peft/peft_model.py:
- Added an inference_mode parameter to match the API of PeftMixedModel.set_adapter()
- Passed inference_mode to both _set_adapter() and base_model.set_adapter()

Added comprehensive tests in tests/test_other.py:
- test_modules_to_save_inference_mode_requires_grad_false: verifies requires_grad=False in inference mode
- test_modules_to_save_training_mode_requires_grad_true: verifies requires_grad=True in training mode
- test_modules_to_save_inference_mode_with_torch_inference_mode: verifies compatibility with torch.inference_mode()

Changes Made

Code Changes

- src/peft/utils/other.py: updated _set_adapter() to conditionally call enable_adapters() based on the inference_mode parameter
- src/peft/peft_model.py: updated the set_adapter() method signature to accept an inference_mode parameter and pass inference_mode to the underlying adapter setting functions
- tests/test_other.py: added the TestModulesToSaveInferenceMode test class with 3 comprehensive tests

Testing

Test Results

All new tests pass:
- test_modules_to_save_inference_mode_requires_grad_false - PASSED
- test_modules_to_save_training_mode_requires_grad_true - PASSED
- test_modules_to_save_inference_mode_with_torch_inference_mode - PASSED

All existing modules_to_save tests pass (11/11). Related tests pass (71/74 - the 3 failures are unrelated BOFT dependency issues). Code quality checks pass (make quality).

Test Coverage

The new tests verify:
- modules_to_save parameters correctly have requires_grad=False when inference_mode=True
- modules_to_save parameters correctly have requires_grad=True when inference_mode=False (training mode)
- compatibility with the torch.inference_mode() context manager

Example Usage

Before this fix, calling set_adapter() with inference_mode=True on a model with modules_to_save would leave those parameters trainable and fail with quantized models; after this fix, the same call leaves them frozen.