@richardSHkim
Contributor

@richardSHkim richardSHkim commented Sep 22, 2023

What does this PR do?

  • DDIMInverseScheduler currently behaves differently from the DDIM inversion in prompt-to-prompt.
  • With this PR, the first timestep value (t) passed to the UNet becomes 1, as in prompt-to-prompt, instead of -19.
  • The timestep values used inside the step() function of DDIMInverseScheduler are the same as before (starting from -19).
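
To make the numbers in the list above concrete, here is a minimal sketch of the two timestep grids. It assumes "leading" spacing, 1000 training timesteps and steps_offset=1 (the values shown in the scheduler config of the failing test log further down). The helper names `leading_timesteps` and `shifted_timesteps` are hypothetical and are not part of diffusers; this illustrates the arithmetic only, not the scheduler's actual code.

```python
def leading_timesteps(num_inference_steps, num_train_timesteps=1000, steps_offset=1):
    """Hypothetical helper: ascending 'leading' grid, i * stride + offset."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [i * step_ratio + steps_offset for i in range(num_inference_steps)]


def shifted_timesteps(num_inference_steps, num_train_timesteps=1000, steps_offset=1):
    """Hypothetical helper: the same grid moved back by one stride,
    which is how the description characterises the pre-PR UNet input."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [t - step_ratio for t in leading_timesteps(
        num_inference_steps, num_train_timesteps, steps_offset)]


# 50 inference steps, as in the example script below: stride is 1000 // 50 = 20.
new_t = leading_timesteps(50)
old_t = shifted_timesteps(50)
print(old_t[0], new_t[0])  # first UNet timestep before and after the change
```

With 5 inference steps the same arithmetic gives the two tensors compared in the `test_steps_offset` failure quoted later in this thread.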

Example code and some results.

import io
import requests
import torch
from PIL import Image
import numpy as np

from diffusers import StableDiffusionPipeline, DDIMScheduler, DDIMInverseScheduler


def load_512(image_path, left=0, right=0, top=0, bottom=0, size=512):
    if isinstance(image_path, str):
        image = np.array(Image.open(image_path))[:, :, :3]
    else:
        image = image_path
    h, w, c = image.shape
    left = min(left, w-1)
    right = min(right, w - left - 1)
    top = min(top, h - 1)  # clamp against the image height, not the left crop
    bottom = min(bottom, h - top - 1)
    image = image[top:h-bottom, left:w-right]
    h, w, c = image.shape
    if h < w:
        offset = (w - h) // 2
        image = image[:, offset:offset + h]
    elif w < h:
        offset = (h - w) // 2
        image = image[offset:offset + w]
    image = np.array(Image.fromarray(image).resize((size, size)))
    return image


if __name__ == "__main__":
    # configs
    device = "cuda:0"
    num_inference_steps = 50
    guidance_scale = 1.0

    # load model
    model_id = "runwayml/stable-diffusion-v1-5"
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float32)
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
    pipe.to(device)

    # download image
    # example 1
    r = requests.get("https://github.com/google/prompt-to-prompt/raw/main/example_images/gnochi_mirror.jpeg", timeout=4.0)
    prompt = "a cat sitting next to a mirror"

    # example 2
    # r = requests.get("https://github.com/pix2pixzero/pix2pix-zero/raw/main/assets/test_images/cats/cat_6.png", timeout=4.0)
    # prompt = "a photography of a black and white kitten in a field of daisies"

    image = Image.open(io.BytesIO(r.content))
    image = load_512(np.array(image))
    image = Image.fromarray(image)

    # get image latents
    with torch.no_grad():
        init_image = pipe.image_processor.preprocess(image, height=512, width=512)
        init_image = init_image.to(device)
        image_latents = pipe.vae.encode(init_image)['latent_dist'].mean
        image_latents = pipe.vae.config.scaling_factor * image_latents

    # vae reconstruction
    with torch.no_grad():
        image_vae_recon = pipe.vae.decode(image_latents / pipe.vae.config.scaling_factor)['sample']
        image_vae_recon = pipe.image_processor.postprocess(image_vae_recon, output_type='pil')[0]

    # ddim inversion
    pipe.scheduler = DDIMInverseScheduler.from_config(pipe.scheduler.config)
    inversion_latents = pipe(prompt, 
                             num_inference_steps=num_inference_steps, 
                             guidance_scale=guidance_scale, 
                             latents=image_latents, 
                             output_type="latent",
                             )['images']

    # ddim inversion reconstruction
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
    image_ddim_recon = pipe(prompt, 
                            num_inference_steps=num_inference_steps, 
                            guidance_scale=guidance_scale, 
                            latents=inversion_latents,
                            )['images'][0]

    # save the results
    Image.fromarray(np.concatenate([np.array(image), 
                                    np.array(image_vae_recon), 
                                    np.array(image_ddim_recon),
                                    ], axis=1)).save("ddim_inversion.png")

Before

From left to right: the original image, the VAE reconstruction, and the image reconstructed from the DDIM-inverted latents
ddim_inversion_before

After

ddim_inversion_after

Before

ddim_inversion_test_before

After

ddim_inversion_test_after

Test with StableDiffusionPix2PixZeroPipeline

test code from: #2397

Source Image

cat_6

Before

pix2pixzero_before

After

pix2pixzero_after

Before submitting

Who can review?

@patrickvonplaten @clarencechen

@patrickvonplaten
Contributor

I can reproduce the results from @richardSHkim! I agree that this was a bug and that this PR should fix it.

@clarencechen can you also take a look? @richardSHkim can you maybe update the tests of the "Fast tests for PRs" suite? These tests currently fail:

FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_no_noise - assert 162.57374062500003 < 0.01
 +  where 162.57374062500003 = abs((671.681640625 - 509.1079))
 +    where 671.681640625 = <built-in method item of Tensor object at 0x7fc38d2cfe00>()
 +      where <built-in method item of Tensor object at 0x7fc38d2cfe00> = tensor(671.6816).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_with_no_set_alpha_to_one - assert 303.61718017578124 < 0.01
 +  where 303.61718017578124 = abs((542.6721801757812 - 239.055))
 +    where 542.6721801757812 = <built-in method item of Tensor object at 0x7fc3b47e26d0>()
 +      where <built-in method item of Tensor object at 0x7fc3b47e26d0> = tensor(542.6722).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_with_set_alpha_to_one - assert 280.150558203125 < 0.01
 +  where 280.150558203125 = abs((539.962158203125 - 259.8116))
 +    where 539.962158203125 = <built-in method item of Tensor object at 0x7fc3b46cdea0>()
 +      where <built-in method item of Tensor object at 0x7fc3b46cdea0> = tensor(539.9622).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_full_loop_with_v_prediction - assert 365.0895058593751 < 0.01
 +  where 365.0895058593751 = abs((1394.218505859375 - 1029.129))
 +    where 1394.218505859375 = <built-in method item of Tensor object at 0x7fc3b46a5770>()
 +      where <built-in method item of Tensor object at 0x7fc3b46a5770> = tensor(1394.2185).item
FAILED tests/schedulers/test_scheduler_ddim_inverse.py::DDIMInverseSchedulerTest::test_steps_offset - assert False
 +  where False = <built-in method equal of type object at 0x7fc3d7019de0>(tensor([  1, 201, 401, 601, 801]), tensor([-199,    1,  201,  401,  601]))
 +    where <built-in method equal of type object at 0x7fc3d7019de0> = torch.equal
 +    and   tensor([  1, 201, 401, 601, 801]) = DDIMInverseScheduler {\n  "_class_name": "DDIMInverseScheduler",\n  "_diffusers_version": "0.22.0.dev0",\n  "beta_end": 0.02,\n  "beta_schedule": "linear",\n  "beta_start": 0.0001,\n  "clip_sample": true,\n  "clip_sample_range": 1.0,\n  "num_train_timesteps": 1000,\n  "prediction_type": "epsilon",\n  "rescale_betas_zero_snr": false,\n  "set_alpha_to_one": true,\n  "steps_offset": 1,\n  "timestep_spacing": "leading",\n  "trained_betas": null\n}\n.timesteps
 +    and   tensor([-199,    1,  201,  401,  601]) = <class 'torch.LongTensor'>([-199, 1, 201, 401, 601])
 +      where <class 'torch.LongTensor'> = torch.LongTensor

Then we can merge the PR here :-)

@richardSHkim
Contributor Author

@patrickvonplaten Thank you for checking.
I have updated the tests for the DDIM inverse scheduler, along with the tests for the related pipelines (pix2pix zero, diffedit).

@patrickvonplaten
Contributor

Waiting a couple more days for potential feedback from @clarencechen but this should be good to merge

@clarencechen
Contributor

Hey @richardSHkim, thanks for doing thorough tests with your PR to check result quality. Early implementations of DDIM inversion were slightly different across research papers and projects in this respect, and I was too focused on making sure that the scheduler steps exactly mirrored the forward (denoising) process to pay attention to the timestep value fed into the UNet. Please merge.
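
The mirroring between the inverse step and the forward (denoising) step mentioned above can be sketched with scalars. This is a minimal illustration of the deterministic DDIM update (eta = 0, epsilon prediction), not the diffusers implementation; the function name `ddim_move` and the alpha values are assumptions chosen for the example. It shows that the two steps cancel exactly when the same noise estimate is used in both directions. In a real pipeline the noise estimate comes from the UNet and depends on the timestep passed to it, which is the value this PR changes.

```python
import math


def ddim_move(x, eps, alpha_from, alpha_to):
    """Hypothetical helper: deterministic DDIM update between two
    cumulative-alpha levels, given a noise estimate eps."""
    # Predict the clean sample from the current sample and the noise estimate.
    x0_pred = (x - math.sqrt(1 - alpha_from) * eps) / math.sqrt(alpha_from)
    # Re-noise the prediction to the target level with the same estimate.
    return math.sqrt(alpha_to) * x0_pred + math.sqrt(1 - alpha_to) * eps


# Illustrative values only (not taken from any real noise schedule).
alpha_less_noise, alpha_more_noise = 0.99, 0.95
x, eps = 0.3, -1.2

x_inverted = ddim_move(x, eps, alpha_less_noise, alpha_more_noise)  # inversion step
x_back = ddim_move(x_inverted, eps, alpha_more_noise, alpha_less_noise)  # denoising step
print(abs(x_back - x) < 1e-9)
```

Because the UNet output for the inversion step and for the matching denoising step are generally not identical, the round trip in the full pipeline is only approximate.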

@richardSHkim
Contributor Author

@clarencechen Thanks for your confirmation, and I appreciate your previous work on inverse schedulers!

@patrickvonplaten
Contributor

Awesome! Merging then and fixing the code quality PR directly on main

@patrickvonplaten
Contributor

Great job @richardSHkim

@patrickvonplaten patrickvonplaten merged commit 9c03a7d into huggingface:main Sep 29, 2023
chuzhdontcode pushed a commit to chuzhdontcode/diffusers that referenced this pull request Oct 4, 2023
* fix ddim inverse scheduler

* update test of ddim inverse scheduler

* update test of pix2pix_zero

* update test of diffedit

* fix typo

---------

Co-authored-by: Patrick von Platen <[email protected]>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024