Conversation

@Rbrq03
Contributor

@Rbrq03 Rbrq03 commented Mar 11, 2024

What does this PR do?

This PR fixes the loading of cross-attention weights in Custom Diffusion models when PEFT is installed. This bug was discussed in issue #7261.

Fixes #7261
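
For context, a minimal sketch of how the reported failure can be triggered (the base model, checkpoint path, and weight filename below are placeholder assumptions, not taken from this PR):

import torch
from diffusers import DiffusionPipeline

# With PEFT installed, loading Custom Diffusion attention processors used to take the
# PEFT code path and skip set_attn_processor, so the cross-attention weights were
# never attached to the UNet.
pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16).to("cuda")

# Placeholder path and filename for a locally trained Custom Diffusion checkpoint.
pipe.unet.load_attn_procs(
    "path/to/custom-diffusion-output", weight_name="pytorch_custom_diffusion_weights.bin"
)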

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul

Comment on lines 377 to 381
elif is_custom_diffusion:
    # Load custom diffusion cross-attention weights with PEFT installed in the environment
    self.set_attn_processor(attn_processors)

self.to(dtype=self.dtype, device=self.device)
Member

We're already doing this here, no?

Perhaps, guarding this code block with if not is_custom_diffusion would be more helpful?

# set lora layers

Contributor Author

self.set_attn_processor(attn_processors)

is executed only when PEFT is not installed. If PEFT is installed, the bug described in issue #7261 appears, because

if not USE_PEFT_BACKEND:

evaluates to false and the execution of

self.set_attn_processor(attn_processors)

is skipped.

I hope this answers your question. :)
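
A tiny standalone toy (not the diffusers source; the flags below are stand-ins for the module-level USE_PEFT_BACKEND and the local is_custom_diffusion) that mirrors this branching:

def would_set_attn_processor(use_peft_backend: bool, is_custom_diffusion: bool) -> bool:
    """Return True if the pre-fix code path would reach set_attn_processor."""
    if not use_peft_backend:
        if is_custom_diffusion:
            return True  # Custom Diffusion processors get attached here.
    return False  # The whole branch is skipped when PEFT is installed.


# Without PEFT the processors are attached as expected.
assert would_set_attn_processor(use_peft_backend=False, is_custom_diffusion=True)
# With PEFT installed the branch is skipped, which is the bug in #7261.
assert not would_set_attn_processor(use_peft_backend=True, is_custom_diffusion=True)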

Member

Oh thanks so much. In that case, could we maybe simply move

if is_custom_diffusion:

out of

if not USE_PEFT_BACKEND:

?

Contributor Author

@Rbrq03 Rbrq03 Mar 11, 2024

I have tried this solution, but it fails. The reason is that

self.to(dtype=self.dtype, device=self.device)

is no longer executed.

This causes a "not same device" error when training on a GPU, since self has not been moved to the GPU.

In addition, I think

self.to(dtype=self.dtype, device=self.device)

cannot simply be moved out of

if not USE_PEFT_BACKEND:

since, when PEFT is not installed, it has to sit between

if _pipeline is not None:
    for _, component in _pipeline.components.items():
        if isinstance(component, nn.Module) and hasattr(component, "_hf_hook"):
            is_model_cpu_offload = isinstance(getattr(component, "_hf_hook"), CpuOffload)
            is_sequential_cpu_offload = isinstance(getattr(component, "_hf_hook"), AlignDevicesHook)

            logger.info(
                "Accelerate hooks detected. Since you have called `load_lora_weights()`, the previous hooks will be first removed. Then the LoRA parameters will be loaded and the hooks will be applied again."
            )
            remove_hook_from_module(component, recurse=is_sequential_cpu_offload)

and

# Offload back.
if is_model_cpu_offload:
    _pipeline.enable_model_cpu_offload()
elif is_sequential_cpu_offload:
    _pipeline.enable_sequential_cpu_offload()

Hope this helps. :)
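
For illustration, a minimal plain-PyTorch sketch (not diffusers code) of the device issue described above:

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

parent = nn.Linear(4, 4).to(device)
new_processor = nn.Linear(4, 4)  # stand-in for a freshly loaded attention processor

# Attaching a new submodule does NOT move it to the parent's device/dtype ...
parent.add_module("attn_processor", new_processor)
print(next(parent.attn_processor.parameters()).device)  # cpu, even if parent is on cuda

# ... so an explicit .to(...) afterwards is required, which is the role of
# self.to(dtype=self.dtype, device=self.device) in the snippet above.
parent.to(device)
print(next(parent.attn_processor.parameters()).device)  # now matches device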

Member

I see. I understand now.

I think the existing

if is_custom_diffusion:

needs to be deleted, no?

Member

Then it feels redundant. Why would we want to add the same piece of code twice? 😱

Contributor Author

@Rbrq03 Rbrq03 Mar 11, 2024

Let's make the question clear.

Our goal is to have the cross-attention weights of Custom Diffusion load successfully whether or not PEFT is installed, since

  • PEFT is commonly installed for LoRA,
  • but people who only want to use custom_diffusion don't need PEFT.

In my opinion, we need to keep both pieces of code:

  • PEFT offloads the UNet before

if not USE_PEFT_BACKEND:

so we don't need to offload the UNet ourselves when PEFT is installed.

  • If we delete it, this code may become unsafe when PEFT is not installed, since the UNet is not offloaded:

if is_custom_diffusion:
    self.set_attn_processor(attn_processors)

Or maybe we can change this piece of code to:

        if not USE_PEFT_BACKEND:
            if _pipeline is not None:
                for _, component in _pipeline.components.items():
                    if isinstance(component, nn.Module) and hasattr(component, "_hf_hook"):
                        is_model_cpu_offload = isinstance(getattr(component, "_hf_hook"), CpuOffload)
                        is_sequential_cpu_offload = isinstance(getattr(component, "_hf_hook"), AlignDevicesHook)

                        logger.info(
                            "Accelerate hooks detected. Since you have called `load_lora_weights()`, the previous hooks will be first removed. Then the LoRA parameters will be loaded and the hooks will be applied again."
                        )
                        remove_hook_from_module(component, recurse=is_sequential_cpu_offload)

        # only custom diffusion needs to set attn processors
        if is_custom_diffusion:
            # Load custom diffusion cross-attention weights even when PEFT is installed in the environment
            self.set_attn_processor(attn_processors)

        # set lora layers
        for target_module, lora_layer in lora_layers_list:
            target_module.set_lora_layer(lora_layer)

        self.to(dtype=self.dtype, device=self.device)
        
        if not USE_PEFT_BACKEND:
            # Offload back.
            if is_model_cpu_offload:
                _pipeline.enable_model_cpu_offload()
            elif is_sequential_cpu_offload:
                _pipeline.enable_sequential_cpu_offload()
            # Unsafe code />

Member

Can't we leverage these two variables, is_custom_diffusion and USE_PEFT_BACKEND, more systematically to handle the logic you mentioned? I am uncomfortable keeping the exact same line of code duplicated like this, as it makes the code confusing for us maintainers.

Member

Your proposal looks good to me.

Contributor Author

@Rbrq03 Rbrq03 Mar 11, 2024

Glad to hear that; I will update my PR. It's been nice discussing this with you, and your ideas about the code have inspired me a lot.

Member

@sayakpaul sayakpaul left a comment

Looks much cleaner to me!

@sayakpaul sayakpaul requested a review from yiyixuxu March 12, 2024 03:27
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rbrq03 Rbrq03 requested a review from sayakpaul March 13, 2024 06:28
@sayakpaul
Member

LGTM! @yiyixuxu WDYT?

@Rbrq03
Contributor Author

Rbrq03 commented Mar 18, 2024

Hey Saya, I need a review of the new commit; I reformatted this file. @sayakpaul

@sayakpaul
Member

It's Sayak, not "Saya" 😅 Will trigger the CI now.

@Rbrq03
Contributor Author

Rbrq03 commented Mar 18, 2024

I am sorry about that. 😂

What's happening with this commit? Should I do something to help it pass the CI?

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Apr 11, 2024
@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label Apr 11, 2024
@github-actions
Contributor

github-actions bot commented May 6, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label May 6, 2024
@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label May 8, 2024
@yiyixuxu
Collaborator

yiyixuxu commented May 8, 2024

Can you update the quality dependency and then run make style again?

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024