
Conversation

Collaborator

@alexsherstinsky alexsherstinsky commented Sep 21, 2023

Description

This contribution provides an implementation of LLM.merge_and_unload(), which is triggered from LudwigModel.train() based on the configuration. The "merge and unload" behavior merges the fine-tuned LoRA weights into the base model so that users can load one complete model (e.g., from HuggingFace) with a single AutoModelForCausalLM.from_pretrained() call, rather than two calls (first loading the base model with AutoModelForCausalLM.from_pretrained() and then loading the fine-tuned weights with PeftModelForCausalLM.from_pretrained()). This capability makes the inference code portable between Ludwig (LudwigModel.load() followed by model.predict() for inference) and other stacks (e.g., HuggingFace, using AutoModelForCausalLM.from_pretrained() followed by transformers.pipeline() for inference).
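As a rough illustration of the difference, here is a minimal sketch of the two loading paths described above (the model names and adapter path are placeholders, not part of this PR):

from peft import PeftModelForCausalLM
from transformers import AutoModelForCausalLM

# Without "merge and unload": two calls -- load the base model, then attach the fine-tuned LoRA weights.
base_model = AutoModelForCausalLM.from_pretrained("base-model-name")
model = PeftModelForCausalLM.from_pretrained(base_model, "path/to/lora-adapter")

# With "merge and unload": the LoRA weights are already folded into the saved weights,
# so a single call suffices.
model = AutoModelForCausalLM.from_pretrained("path/to/merged-model")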

The configuration change extends the adapter section with an optional postprocessor section, as follows:

adapter:
  type: lora
  postprocessor:
    merge_adapter_into_base_model: true
    progressbar: true

(If merge_adapter_into_base_model is set to false, the progressbar directive can be omitted.)
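For reference, a minimal sketch of how this could be exercised end to end via the Ludwig API (the base model, feature names, trainer settings, and dataset below are illustrative placeholders, not part of this PR):

from ludwig.api import LudwigModel

config = {
    "model_type": "llm",
    "base_model": "some-base-model",  # placeholder
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "response", "type": "text"}],
    "adapter": {
        "type": "lora",
        "postprocessor": {
            "merge_adapter_into_base_model": True,
            "progressbar": True,
        },
    },
    "trainer": {"type": "finetune"},
}

model = LudwigModel(config=config)
# Fine-tunes the LoRA adapter and, per the configuration above, merges it into the base model.
model.train(dataset="train.csv")  # placeholder dataset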

Code Pull Requests

Please provide the following:

  • a clear explanation of what your code does
  • if applicable, a reference to an issue
  • a reproducible test for your PR (code, config and data sample)

Documentation Pull Requests

Note that the documentation HTML files are in docs/ while the Markdown sources are in mkdocs/docs.

If you are proposing a modification to the documentation, you should change only the Markdown files.

api.md is automatically generated from the docstrings in the code, so if you want to change something in that file, first modify the docstring in ludwig/api.py, then run mkdocs/code_docs_autogen.py, which will create mkdocs/docs/api.md.
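For example, the regeneration step would look roughly like this (assuming it is run from the repository root with the docs dependencies installed):

python mkdocs/code_docs_autogen.py  # rewrites mkdocs/docs/api.md from the ludwig/api.py docstrings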

…to be able to use the base model as the standalone model, rather than having to first load the base model and then load the fine-tuned weights.
@arnavgarg1 arnavgarg1 self-requested a review September 21, 2023 21:17

github-actions bot commented Sep 21, 2023

Unit Test Results

    6 files  ±0      6 suites  ±0   58m 24s ⏱️ +4m 34s
2 807 tests ±0   2 782 ✔️ -11   23 💤 +11   2 ❌ ±0
2 847 runs  ±0   2 809 ✔️ -15   36 💤 +15   2 ❌ ±0

For more details on these failures, see this check.

Results for commit c4f2185. ± Comparison against base commit 1286123.

♻️ This comment has been updated with latest results.

@arnavgarg1
Contributor

Hi @alexsherstinsky! Great work on this PR and thanks for your contribution - I will try and review by EOW!

@alexsherstinsky
Collaborator Author

> Hi @alexsherstinsky! Great work on this PR and thanks for your contribution - I will try and review by EOW!

Hi @arnavgarg1! Thank you so much for your help and support. Could you please wait a little bit -- I want to try to add another test, specific to this particular new feature -- I will let you know once it is ready (hopefully later today). Thanks a lot!

Collaborator

@tgaddair tgaddair left a comment


Nice! This is an awesome addition, and I think the way it's integrated into the config is quite clean.

Comment on lines 631 to 800
  # For a full explanation of this 8-bit workaround, see https://github.com/ludwig-ai/ludwig/pull/3606
- def filter_for_weight_format(i):
-     """Remove bitsandbytes metadata keys added on state dict creation.
-
-     8-bit quantized models that have been put on gpu will have a set of `weight_format` keys in their state dict.
-     These contain strings that are used to reshape quantized tensors, however these have no impact until the state
-     dict is loaded into a model. These keys were causing `torch.equal` to raise an exception, so we skip them in the
-     evaluation.
-     """
-     return "weight_format" not in i[0]
-
- model_1_filtered_state_dict = filter(filter_for_weight_format, model_1.state_dict().items())
- model_2_filtered_state_dict = filter(filter_for_weight_format, model_2.state_dict().items())
+ # def filter_for_weight_format(i):
+ #     """Remove bitsandbytes metadata keys added on state dict creation.
+ #
+ #     8-bit quantized models that have been put on gpu will have a set of `weight_format` keys in their state dict.
+ #     These contain strings that are used to reshape quantized tensors, however these have no impact until the state
+ #     dict is loaded into a model. These keys were causing `torch.equal` to raise an exception, so we skip them in
+ #     the evaluation.
+ #     """
+ #     return "weight_format" not in i[0]
+
+ # model_1_filtered_state_dict = filter(filter_for_weight_format, model_1.state_dict().items())
+ # model_2_filtered_state_dict = filter(filter_for_weight_format, model_2.state_dict().items())
  # Source: https://discuss.pytorch.org/t/check-if-models-have-same-weights/4351/6
  if model_1.__class__.__name__ != model_2.__class__.__name__:
      return False

Contributor


I'm okay with commenting out some of these lines of code for now since we still don't have GPU tests set up for Ludwig (currently a work in progress), but this check for filtering the state dict for weight formats under 8-bit quantization is necessary to make sure the tests comparing models work correctly when we test fine-tuning with 8-bit quantization!
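As a point of reference, here is a minimal sketch (not the actual Ludwig test code; the function name and structure are assumptions) of a model-weight comparison that skips those bitsandbytes metadata keys:

import torch

def models_have_same_weights(model_1, model_2):
    """Return True if the two models have identical weights, ignoring bitsandbytes metadata."""
    def filter_for_weight_format(item):
        # 8-bit quantized models carry string-valued "weight_format" entries in their state dict;
        # these are not tensors, so torch.equal would raise on them.
        return "weight_format" not in item[0]

    state_dict_1 = dict(filter(filter_for_weight_format, model_1.state_dict().items()))
    state_dict_2 = dict(filter(filter_for_weight_format, model_2.state_dict().items()))
    if state_dict_1.keys() != state_dict_2.keys():
        return False
    return all(torch.equal(state_dict_1[key], state_dict_2[key]) for key in state_dict_1)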

Contributor


I'd maybe create a comment with a TODO here to re-enable those lines of code when GPU tests are enabled? That should be fine for now.
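For instance, such a marker might look like the following (wording is illustrative, not the exact TODO that was added to the PR):

# TODO: Re-enable filter_for_weight_format() and the filtered state-dict comparison once GPU tests are
# available, so that 8-bit quantized fine-tuning runs are compared without the "weight_format" metadata keys.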

Collaborator Author


@arnavgarg1 The issue is that the filter_for_weight_format() method (which focuses on 8-bit quantization) is not currently used, and it was causing linter errors. Could you please suggest what we should do? Should we keep it commented out for now (I added a TODO in the updated PR), or try to re-enable it and use it? Thank you.

Contributor


I think it should be okay to keep it commented out for now since we have the TODO - maybe I can take a look at that in a follow-up PR, but for now I don't want to block merging this awesome change into Ludwig because of it since it's causing no harm!

Contributor


cc: @jeffkinnison just a quick FYI so you're not surprised by this change in the test. I will work on adding it back when GPU tests get enabled

Contributor

@arnavgarg1 arnavgarg1 left a comment


Really nice work on this! I especially love the way you've written out your tests for making sure it is working as expected - thanks for being so thorough and for this awesome contribution!

I left some minor comments, but none of them are blocking merging this PR!


…prove language of raised exceptions for LoRA model usage (based on PR feedback).
@arnavgarg1 arnavgarg1 merged commit 3dc8f4b into ludwig-ai:master Sep 27, 2023
@alexsherstinsky alexsherstinsky deleted the feature/ISSUE-3603/alexsherstinsky/finetuning/support_merging_lora_weights_into_base_model-2023_09_13-0 branch September 27, 2023 19:42