Fix DDIMInverseScheduler #5145

richardSHkim · 2023-09-22T09:39:12Z

What does this PR do?

DDIMInverseScheduler currently behaves differently from the DDIM inversion in prompt-to-prompt.
With this PR, the first timestep value (t) passed to the UNet becomes 1, as in prompt-to-prompt, instead of -19.
The timestep values used inside the step() function of DDIMInverseScheduler are the same as before (starting from -19).
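
To make the shift concrete, here is a minimal NumPy sketch of the two timestep sequences. It is not the scheduler's actual code: the helper name `inverse_timesteps` is hypothetical, and it assumes "leading" timestep spacing with `steps_offset=1` and 1000 training timesteps, which matches the scheduler config printed in the failing-test log in this thread.

```python
import numpy as np


def inverse_timesteps(num_inference_steps, num_train_timesteps=1000, steps_offset=1):
    """Return (before, after) UNet timestep sequences for DDIM inversion.

    Hypothetical helper for illustration only; assumes "leading" spacing.
    """
    step_ratio = num_train_timesteps // num_inference_steps
    # After the fix: ascending timesteps that start at steps_offset.
    after = np.arange(num_inference_steps) * step_ratio + steps_offset
    # Before the fix: the same sequence shifted down by one step_ratio.
    before = after - step_ratio
    return before, after


# 50 steps, as in the example script: step_ratio is 20.
before, after = inverse_timesteps(50)
print(before[:3], after[:3])  # first entries: -19, 1, 21 and 1, 21, 41
```

With `num_inference_steps=5` the same helper gives `[-199, 1, 201, 401, 601]` and `[1, 201, 401, 601, 801]`, the two tensors compared in the `test_steps_offset` failure reported later in this thread.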

Example code and some results.

import io
import requests
import torch
from PIL import Image
import numpy as np

from diffusers import StableDiffusionPipeline, DDIMScheduler, DDIMInverseScheduler


def load_512(image_path, left=0, right=0, top=0, bottom=0, size=512):
    # Accept either a file path or an already-loaded HxWxC array.
    if isinstance(image_path, str):
        image = np.array(Image.open(image_path))[:, :, :3]
    else:
        image = image_path
    h, w, c = image.shape
    # Clamp the crop margins so at least one pixel remains in each dimension.
    left = min(left, w - 1)
    right = min(right, w - left - 1)
    top = min(top, h - 1)
    bottom = min(bottom, h - top - 1)
    image = image[top:h-bottom, left:w-right]
    h, w, c = image.shape
    if h < w:
        offset = (w - h) // 2
        image = image[:, offset:offset + h]
    elif w < h:
        offset = (h - w) // 2
        image = image[offset:offset + w]
    image = np.array(Image.fromarray(image).resize((size, size)))
    return image


if __name__=="__main__":
    # configs
    device = "cuda:0"
    num_inference_steps = 50
    guidance_scale = 1.0

    # load model
    model_id = "runwayml/stable-diffusion-v1-5"
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float32)
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
    pipe.to(device)

    # download image
    # example 1
    r = requests.get("https://github.com/google/prompt-to-prompt/raw/main/example_images/gnochi_mirror.jpeg", timeout=4.0)
    prompt = "a cat sitting next to a mirror"

    # example 2
    # r = requests.get("https://github.com/pix2pixzero/pix2pix-zero/raw/main/assets/test_images/cats/cat_6.png", timeout=4.0)
    # prompt = "a photography of a black and white kitten in a field of daies"
    
    image = Image.open(io.BytesIO(r.content))
    image = load_512(np.array(image))
    image = Image.fromarray(image)

    # get image latents
    with torch.no_grad():
        init_image = pipe.image_processor.preprocess(image, height=512, width=512)
        init_image = init_image.to(device)
        image_latents = pipe.vae.encode(init_image)['latent_dist'].mean
        image_latents = pipe.vae.config.scaling_factor * image_latents

    # vae reconstruction
    with torch.no_grad():
        image_vae_recon = pipe.vae.decode(image_latents / pipe.vae.config.scaling_factor)['sample']
        image_vae_recon = pipe.image_processor.postprocess(image_vae_recon, output_type='pil')[0]

    # ddim inversion
    pipe.scheduler = DDIMInverseScheduler.from_config(pipe.scheduler.config)
    inversion_latents = pipe(prompt, 
                             num_inference_steps=num_inference_steps, 
                             guidance_scale=guidance_scale, 
                             latents=image_latents, 
                             output_type="latent",
                             )['images']

    # ddim inversion reconstruction
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
    image_ddim_recon = pipe(prompt, 
                            num_inference_steps=num_inference_steps, 
                            guidance_scale=guidance_scale, 
                            latents=inversion_latents,
                            )['images'][0]

    # save the results
    Image.fromarray(np.concatenate([np.array(image), 
                                    np.array(image_vae_recon), 
                                    np.array(image_ddim_recon),
                                    ], axis=1)).save("ddim_inversion.png")

Before

From left to right: the original image, the VAE reconstruction, and the image reconstructed from the DDIM-inverted latents

After

Before

After

Test with `StableDiffusionPix2PixZeroPipeline`

test code from: #2397

Source Image

Before

After

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@patrickvonplaten @clarencechen

patrickvonplaten · 2023-09-25T17:09:36Z

I can reproduce the results from @richardSHkim! I agree that this was a bug and that this PR should fix it.

@clarencechen can you also take a look? @richardSHkim can you maybe update the tests of the "Fast tests for PRs" suite:

FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_no_noise - assert 162.57374062500003 < 0.01
 +  where 162.57374062500003 = abs((671.681640625 - 509.1079))
 +    where 671.681640625 = <built-in method item of Tensor object at 0x7fc38d2cfe00>()
 +      where <built-in method item of Tensor object at 0x7fc38d2cfe00> = tensor(671.6816).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_with_no_set_alpha_to_one - assert 303.61718017578124 < 0.01
 +  where 303.61718017578124 = abs((542.6721801757812 - 239.055))
 +    where 542.6721801757812 = <built-in method item of Tensor object at 0x7fc3b47e26d0>()
 +      where <built-in method item of Tensor object at 0x7fc3b47e26d0> = tensor(542.6722).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_with_set_alpha_to_one - assert 280.150558203125 < 0.01
 +  where 280.150558203125 = abs((539.962158203125 - 259.8116))
 +    where 539.962158203125 = <built-in method item of Tensor object at 0x7fc3b46cdea0>()
 +      where <built-in method item of Tensor object at 0x7fc3b46cdea0> = tensor(539.9622).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_with_v_prediction - assert 365.0895058593751 < 0.01
 +  where 365.0895058593751 = abs((1394.218505859375 - 1029.129))
 +    where 1394.218505859375 = <built-in method item of Tensor object at 0x7fc3b46a5770>()
 +      where <built-in method item of Tensor object at 0x7fc3b46a5770> = tensor(1394.2185).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_steps_offset - assert False
 +  where False = <built-in method equal of type object at 0x7fc3d7019de0>(tensor([  1, 201, 401, 601, 801]), tensor([-199,    1,  201,  401,  601]))
 +    where <built-in method equal of type object at 0x7fc3d7019de0> = torch.equal
 +    and   tensor([  1, 201, 401, 601, 801]) = DDIMInverseScheduler {\n  "_class_name": "DDIMInverseScheduler",\n  "_diffusers_version": "0.22.0.dev0",\n  "beta_end": 0.02,\n  "beta_schedule": "linear",\n  "beta_start": 0.0001,\n  "clip_sample": true,\n  "clip_sample_range": 1.0,\n  "num_train_timesteps": 1000,\n  "prediction_type": "epsilon",\n  "rescale_betas_zero_snr": false,\n  "set_alpha_to_one": true,\n  "steps_offset": 1,\n  "timestep_spacing": "leading",\n  "trained_betas": null\n}\n.timesteps
 +    and   tensor([-199,    1,  201,  401,  601]) = <class 'torch.LongTensor'>([-199, 1, 201, 401, 601])
 +      where <class 'torch.LongTensor'> = torch.LongTensor

Once those are updated, we can merge the PR here. :-)

richardSHkim · 2023-09-26T01:15:35Z

@patrickvonplaten Thank you for checking.
I have updated the tests for the DDIM inverse scheduler, along with the tests for the related pipelines (pix2pix zero, diffedit).

patrickvonplaten · 2023-09-26T12:26:50Z

Waiting a couple more days for potential feedback from @clarencechen, but this should be good to merge.

clarencechen · 2023-09-26T18:31:01Z

Hey @richardSHkim, thanks for doing thorough tests with your PR to check result quality. Early implementations of DDIM inversion were slightly different across research papers and projects in this respect, and I was too focused on making sure that the scheduler steps exactly mirrored the forward (denoising) process to pay attention to the timestep value fed into the UNet. Please merge.

richardSHkim · 2023-09-27T08:10:20Z

@clarencechen Thanks for your confirmation, and I appreciate your previous work on inverse schedulers!

patrickvonplaten · 2023-09-27T11:10:28Z

Awesome! Merging then and fixing the code quality PR directly on main

patrickvonplaten · 2023-09-29T06:54:58Z

Great job @richardSHkim

* fix ddim inverse scheduler
* update test of ddim inverse scheduler
* update test of pix2pix_zero
* update test of diffedit
* fix typo

---------

Co-authored-by: Patrick von Platen

fix ddim inverse scheduler

e9eb58a

richardSHkim and others added 4 commits September 26, 2023 10:08

update test of ddim inverse scheduler

860106d

update test of pix2pix_zero

848395f

update test of diffedit

b29ebb8

Merge branch 'main' into ddim-inversion

9235ec2

richardSHkim added 2 commits September 26, 2023 10:21

fix typo

345a94f

Merge branch 'ddim-inversion' of https://github.com/richardSHkim/diff…

0d59bdf

…users into ddim-inversion

richardSHkim and others added 2 commits September 28, 2023 18:32

Merge branch 'main' into ddim-inversion

3dde2a4

Merge branch 'main' into ddim-inversion

c026909

patrickvonplaten merged commit 9c03a7d into huggingface:main Sep 29, 2023

patrickvonplaten mentioned this pull request Oct 9, 2023

Bug in DDIMInverseScheduler function step #5315

Closed

patrickvonplaten mentioned this pull request Nov 21, 2023

Fix bug in DDIMInverseSampler #5832

Closed

6 tasks
