Refactor model offload #4514
Conversation
The documentation is not available anymore as the PR was closed or merged.
In this PR we should also nicely solve the following issue: #4435 (comment), simply because we will just offload all components. @Kubuxu feel free to give this PR a review as well.

@DN6 could you maybe try to take over this PR?

Any progress here @DN6?

@patrickvonplaten Handling it this week.
Kubuxu left a comment:
From the perspective of #4435 it solves it nicely.
@patrickvonplaten This is ready for another review.

Looks good to me! I think once the merge conflicts are resolved and we have verified that everything works on GPU, we can merge. Also, I think we should slightly change the offloading method in the end: https://github.com/huggingface/diffusers/pull/4514/files#r1321307022 (wdyt?)

@patrickvonplaten Getting two failures at the moment when testing, both from Shap E.

Ok, let's merge regardless of the failures and solve them afterward. Can you fix the merge conflicts so we can merge?
@patrickvonplaten Merge conflicts resolved and your suggestions added. There's a failing doc test, but I'm not able to reproduce it locally. Any idea what the issue might be?
| """ | ||
|
|
||
| _load_connected_pipes = True | ||
| model_cpu_offload_seq = "text_encoder->unet->movq->prior_prior->prior_image_encoder->prior_text_encoder" |
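For readers following along: the arrow-separated string declares the order in which components should be moved on and off the GPU. A minimal sketch of how such a string can be turned into an ordered list of component attribute names (illustrative only, not necessarily the PR's exact code):

```python
model_cpu_offload_seq = (
    "text_encoder->unet->movq->prior_prior->prior_image_encoder->prior_text_encoder"
)

# Split the declared chain into an ordered list of component attribute names.
# The "prior_"-prefixed entries refer to components of the connected prior
# pipeline that _load_connected_pipes = True pulls in.
component_order = model_cpu_offload_seq.split("->")
print(component_order)
# ['text_encoder', 'unet', 'movq', 'prior_prior', 'prior_image_encoder', 'prior_text_encoder']
```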
Nice!
src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion.py (outdated, resolved)
```python
latents = latents * self.scheduler.init_noise_sigma
return latents

def enable_model_cpu_offload(self, gpu_id=0):
```
Ok for now, but why not use the default way of model offloading here?
Something I noticed while working on this: certain pipelines (AudioLDM2, MusicLDM, Shap E) do not make use of the forward method of their components. Instead, they pass inputs into submodules of the component:

```python
prompt_embeds = self.text_encoder.get_text_features(
```

This leads to a device mismatch error, since accelerate only moves the module back to GPU when forward is called:

```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu...
```
IMO, if pipelines use submodules of their components during inference, it's fine for them to implement their own enable_model_cpu_offload, since it can be challenging for us to know exactly which modules to offload.
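For illustration, a minimal sketch of the failure mode described above (the model choice and token ids are illustrative assumptions, not taken from the PR; it relies on accelerate's public `cpu_offload_with_hook`):

```python
import torch
from accelerate import cpu_offload_with_hook
from transformers import ClapModel

# A CLAP-style text encoder, as used by MusicLDM (illustrative choice).
model = ClapModel.from_pretrained("laion/clap-htsat-unfused")

# cpu_offload_with_hook patches the module's forward: a pre-forward hook
# moves the weights to the execution device; they are offloaded again when
# a chained module's hook fires or when hook.offload() is called.
model, hook = cpu_offload_with_hook(model, execution_device=torch.device("cuda"))

input_ids = torch.tensor([[101, 102]], device="cuda")  # dummy token ids

# Calling the model's forward would trigger the hook and onload the weights.
# get_text_features, however, is a separate entry point that never triggers
# the forward hook, so the weights stay on CPU while the inputs are on cuda:0:
features = model.get_text_features(input_ids=input_ids)
# RuntimeError: Expected all tensors to be on the same device, ...
```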
In this PR, I've cleaned up `enable_model_cpu_offload` in the problematic pipelines to properly offload the submodules so that users still get the expected memory savings. Alternatively, we could move these problematic modules into the `_exclude_from_cpu_offload` list and use the `enable_model_cpu_offload` defined in `DiffusionPipeline`, but that would reduce the memory savings.
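As a sketch of that alternative (the pipeline class and component names here are hypothetical):

```python
from diffusers import DiffusionPipeline


class MyAudioPipeline(DiffusionPipeline):  # hypothetical pipeline
    model_cpu_offload_seq = "unet->vocoder"
    # Components listed here are skipped by the shared offload logic and
    # simply kept on the execution device, so direct calls into their
    # submodules (e.g. text_encoder.get_text_features) never hit weights
    # that are still on CPU. The trade-off: their memory is never freed.
    _exclude_from_cpu_offload = ["text_encoder"]
```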
Squashed commits:

* [Draft] Refactor model offload
* [Draft] Refactor model offload
* Apply suggestions from code review
* cpu offload updates
* remove model cpu offload from individual pipelines
* add hook to offload models to cpu
* clean up
* model offload
* add model cpu offload string
* make style
* clean up
* fixes for offload issues
* fix tests issues
* resolve merge conflicts
* update src/diffusers/pipelines/pipeline_utils.py (Co-authored-by: Patrick von Platen <[email protected]>)
* make style
* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion.py

Co-authored-by: Dhruv Nair <[email protected]>
What does this PR do?
This PR is similar in spirit to #4114.

Every pipeline can run `enable_model_cpu_offload`, so this is a method we can move to `PipelineModelMixin` to remove some of the boilerplate code here. Since every pipeline has a slightly different chain in which models should be on- and offloaded, we need to add a class attribute that defines this chain as a string.
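A rough sketch of what the shared method could look like, assuming it walks the declared chain with accelerate's `cpu_offload_with_hook` (names simplified; not necessarily the PR's exact code):

```python
import torch
from accelerate import cpu_offload_with_hook


def enable_model_cpu_offload(self, gpu_id=0):
    # Move everything to CPU first so we start from a clean state.
    device = torch.device(f"cuda:{gpu_id}")
    self.to("cpu")
    torch.cuda.empty_cache()

    hook = None
    self._all_hooks = []
    # Walk the declared chain, e.g. "text_encoder->unet->vae". Chaining the
    # hooks via prev_module_hook means that onloading model N offloads
    # model N-1, so roughly one component sits on the GPU at a time.
    for name in self.model_cpu_offload_seq.split("->"):
        model = getattr(self, name)
        if isinstance(model, torch.nn.Module):
            _, hook = cpu_offload_with_hook(model, device, prev_module_hook=hook)
            self._all_hooks.append(hook)
```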
Also, this PR adds a `free_hooks` method that should be called at the end of every pipeline's `__call__` function. This method should be more robust than what we currently have and should also solve bugs such as #2907 (see the sketch below).
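A minimal sketch of the idea, assuming the hooks created in `enable_model_cpu_offload` are collected on the pipeline (the hook object returned by accelerate exposes an `offload()` method):

```python
import torch


def free_hooks(self):
    # Called at the end of __call__: move every hooked model back to CPU so
    # GPU memory is reclaimed once generation finishes. The hooks stay in
    # place, so the next pipeline call onloads models on demand again.
    for hook in getattr(self, "_all_hooks", []):
        hook.offload()
    torch.cuda.empty_cache()
```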
TODO: