
Unable to load FLUX.1-Canny-dev-lora into FluxControlPipeline #11464

Closed
laolongboy opened this issue Apr 30, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@laolongboy

Describe the bug

I tried to run the example code for FLUX.1-Canny-dev-lora from https://huggingface.co/docs/diffusers/v0.33.1/en/api/pipelines/flux#canny-control, but got the following error:

RuntimeError: Error(s) in loading state_dict for FluxTransformer2DModel:
size mismatch for proj_out.lora_A.default_0.weight: copying a param with shape torch.Size([64, 3072]) from checkpoint, the shape in current model is torch.Size([128, 3072]).
size mismatch for proj_out.lora_B.default_0.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([64, 128]).

I checked the code inside the pipeline. It concatenates the noisy tokens and the condition tokens along the channel dimension, which changes the input shape from [batch, token_length, 64] to [batch, token_length, 128].
(https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/flux/pipeline_flux_control.py#L830-L831)

Therefore, the LoRA parameters of the first layer are inconsistent with the base Flux model.
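
To make the mismatch concrete, here is a minimal, illustrative sketch of the concatenation step (shapes follow the pipeline code linked above; the variable names are mine, not the pipeline's):

import torch

# Packed latent tokens: 64 channels each for the noisy latents and the
# canny condition latents (illustrative batch/token sizes).
batch, token_length = 1, 4096
latents = torch.randn(batch, token_length, 64)
control_latents = torch.randn(batch, token_length, 64)

# The pipeline concatenates along the channel dimension, so the transformer
# input has 128 channels instead of the 64 the base x_embedder expects.
latent_model_input = torch.cat([latents, control_latents], dim=2)
print(latent_model_input.shape)  # torch.Size([1, 4096, 128])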

Reproduction

# !pip install -U controlnet-aux
import torch
from controlnet_aux import CannyDetector
from diffusers import FluxControlPipeline
from diffusers.utils import load_image

pipe = FluxControlPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")
pipe.load_lora_weights("black-forest-labs/FLUX.1-Canny-dev-lora")

prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")

processor = CannyDetector()
control_image = processor(control_image, low_threshold=50, high_threshold=200, detect_resolution=1024, image_resolution=1024)

image = pipe(
    prompt=prompt,
    control_image=control_image,
    height=1024,
    width=1024,
    num_inference_steps=50,
    guidance_scale=30.0,
).images[0]
image.save("output.png")

Logs

System Info

  • 🤗 Diffusers version: 0.33.1
  • Platform: Linux-5.10.134-13.an8.x86_64-x86_64-with-glibc2.31
  • Running on Google Colab?: No
  • Python version: 3.10.16
  • PyTorch version (GPU?): 2.6.0+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.29.2
  • Transformers version: 4.43.3
  • Accelerate version: 0.30.1
  • PEFT version: 0.14.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.5.3
  • xFormers version: not installed
  • Accelerator: NVIDIA L40S, 46068 MiB
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help?

No response

laolongboy added the bug label on Apr 30, 2025
@DN6
Collaborator

DN6 commented May 1, 2025

Hi @laolongboy, I'm unable to reproduce this on my end. Could you try running the example in a fresh virtual environment?

@laolongboy
Author

Hi @laolongboy, I'm unable to reproduce this on my end. Could you try running the example in a fresh virtual environment?

Thanks for the reply. I tried another environment and it ran successfully with diffusers==0.32.2, but the same error occurred with diffusers==0.33.1.

@DN6
Collaborator

DN6 commented May 6, 2025

@laolongboy Could you try updating PEFT to the latest version?

@sayakpaul
Member

Hello @laolongboy. Sorry about this. It was a conscious decision as it was getting increasingly difficult to support non-diffusers formats where there are non-ordinary state dict patterns. #10985 (comment) provides more context.

So, the easiest thing would be to update your peft version.
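
For reference, a quick way to confirm which versions the script is actually picking up before and after upgrading (a minimal sketch; the exact minimum PEFT version is not stated in this thread):

# pip install -U peft
import peft
import diffusers

# Print versions to confirm the environment picked up the upgrade.
print("peft:", peft.__version__)
print("diffusers:", diffusers.__version__)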

@laolongboy
Author

Hello @laolongboy. Sorry about this. It was a conscious decision as it was getting increasingly difficult to support non-diffusers formats where there are non-ordinary state dict patterns. #10985 (comment) provides more context.

So, the easiest thing would be to update your peft version.

It works, but I'm still confused.

I loaded this LoRA into FLUX.1-dev, ran pipeline.fuse_lora(), and then compared the transformer's parameters before and after.

The only difference is x_embedder:

[Image: screenshot of the parameter comparison showing the x_embedder difference]

As I analyzed at the beginning, FLUX.1-Canny-dev-lora concatenates the canny input and the noised input along the channel dimension, so the base model and the LoRA have different sizes in the first layer.

In my understanding, the shapes of the base weight (W) and the LoRA product (BA) should be fully aligned (out = Wx + BAx). Here W is 3072x64 while BA is 3072x128.

Is there a problem with my understanding? Or is some conversion done internally in Diffusers or PEFT?
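
For readers who want to repeat the comparison described above, a rough sketch of one way to do it (this is my reconstruction, not the exact script used; load_lora_weights/fuse_lora/unload_lora_weights is the standard fuse workflow):

import torch
from diffusers import FluxControlPipeline

pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# Record parameter shapes before the Control LoRA is applied.
shapes_before = {k: tuple(v.shape) for k, v in pipe.transformer.state_dict().items()}

pipe.load_lora_weights("black-forest-labs/FLUX.1-Canny-dev-lora")
pipe.fuse_lora()
pipe.unload_lora_weights()

shapes_after = {k: tuple(v.shape) for k, v in pipe.transformer.state_dict().items()}

# Report any parameter whose shape changed; per the discussion this should
# essentially be x_embedder, whose in_features grow from 64 to 128.
for name, shape in shapes_before.items():
    if name in shapes_after and shapes_after[name] != shape:
        print(name, shape, "->", shapes_after[name])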

@sayakpaul
Member

I think we're now digressing from the original issue. The issue, IIUC, was about not being able to load LoRAs. What you're describing now is a different problem and should go in a new issue thread. So, I would suggest closing this thread if you think that's okay and opening a new issue/discussion to clarify further doubts.

@laolongboy
Author

I think we're now digressing from the original issue. The issue, IIUC, was about not being able to load LoRAs. What you're describing now is a different problem and should go in a new issue thread. So, I would suggest closing this thread if you think that's okay and opening a new issue/discussion to clarify further doubts.

I just wanted to know the reason behind this. I finally found where the magic happens, and it resolved my doubts. Thanks 🍺

3d735b4
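
For anyone else landing here, my understanding of the mechanism (a conceptual sketch, not the actual diffusers code): when a Control LoRA needs more input features than the base x_embedder provides, the base Linear is expanded and the new input columns are zero-initialized, so the expanded weight matches Wx on the original 64 channels and the extra condition channels only contribute through the LoRA's BA term.

import torch
import torch.nn as nn

def expand_linear_in_features(layer: nn.Linear, new_in_features: int) -> nn.Linear:
    # Illustrative helper (not a diffusers API): grow a Linear's input
    # dimension by zero-padding so the original channels behave identically.
    expanded = nn.Linear(new_in_features, layer.out_features, bias=layer.bias is not None)
    with torch.no_grad():
        expanded.weight.zero_()
        expanded.weight[:, : layer.in_features] = layer.weight
        if layer.bias is not None:
            expanded.bias.copy_(layer.bias)
    return expanded

# Base FLUX x_embedder consumes 64 packed-latent channels; the Canny Control
# LoRA expects 128 (noise + condition), so the base layer is widened first.
x_embedder = nn.Linear(64, 3072)
x_embedder = expand_linear_in_features(x_embedder, 128)
print(x_embedder.weight.shape)  # torch.Size([3072, 128])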
