
Conversation

@chuzhdontcode
Contributor

@chuzhdontcode chuzhdontcode commented Sep 28, 2023

What does this PR do?

Fixes #5216

This PR fixes the TypeError raised when a 0-dimensional tensor is iterated over directly in the denoising stage of the GLIGEN in-painting operation.

The error occurs during in-painting with the StableDiffusionGLIGENPipeline and StableDiffusionGLIGENTextImagePipeline when the noise scheduler's add_noise function iterates over the timesteps it receives
(e.g. EulerAncestralDiscreteScheduler, KDPM2AncestralDiscreteScheduler).

To clarify, this line of the add_noise function 🔽

step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps]
in the affected noise schedulers expects timesteps to be a torch Tensor with at least one dimension. However, the affected pipelines pass a single timestep as a 0-dimensional tensor.
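The failure can be reproduced outside the pipelines. The following is a minimal sketch, assuming only PyTorch: the schedule values and the helper name `lookup_step_indices` are hypothetical, and the helper simply wraps the comprehension quoted above.

```python
import torch

# Hypothetical schedule of timesteps, standing in for a scheduler's timesteps.
schedule_timesteps = torch.tensor([999, 666, 333, 0])

# Inside a denoising loop, `t` is one element of the schedule: a 0-d tensor.
t = schedule_timesteps[1]


def lookup_step_indices(schedule_timesteps, timesteps):
    # Same comprehension as quoted from add_noise above (helper name is hypothetical).
    return [(schedule_timesteps == ts).nonzero().item() for ts in timesteps]


# Passing the 0-d tensor directly fails, because a 0-d tensor cannot be iterated.
try:
    lookup_step_indices(schedule_timesteps, t)
    raised = False
except TypeError:
    raised = True

# Wrapping it in a list yields a 1-d tensor of length 1, which iterates fine.
t_1d = torch.tensor([t])
step_indices = lookup_step_indices(schedule_timesteps, t_1d)
```

Here `raised` ends up `True` for the 0-d input, while the wrapped input resolves to the index of `t` in the schedule.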

This PR follows the approach used in the StableDiffusionInpaintPipeline

if i < len(timesteps) - 1:
    noise_timestep = timesteps[i + 1]
    init_latents_proper = self.scheduler.add_noise(
        init_latents_proper, noise, torch.tensor([noise_timestep])
    )
which is to wrap the 0-d timestep tensor t in a list so that torch.tensor converts it to a 1-d tensor, as follows 🔽

  if gligen_inpaint_image is not None:
      gligen_inpaint_latent_with_noise = (
          self.scheduler.add_noise(
              gligen_inpaint_latent, torch.randn_like(gligen_inpaint_latent), torch.tensor([t])
          )
          .expand(latents.shape[0], -1, -1, -1)
          .clone()
      )

https://github.com/rchuzh99/diffusers/blob/fb82fc4bdcead457e24a780cfb193070227f3e31/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_gligen.py#L799-L806 and https://github.com/rchuzh99/diffusers/blob/fb82fc4bdcead457e24a780cfb193070227f3e31/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_gligen_text_image.py#L960-L967
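The snippet above also chains `.expand(...)` with `.clone()`. The PR does not state why, so the following is only a sketch of what that chain does, assuming PyTorch and hypothetical shapes: `expand` returns a view whose batch entries alias the same memory, and `clone` materializes an independent copy.

```python
import torch

# Hypothetical shapes: one noised in-painting latent, a batch of two latents.
single = torch.randn(1, 4, 8, 8)
batch_size = 2

# expand() returns a view: the batch dimension gets stride 0, so every
# batch entry aliases the same underlying memory as `single`.
view = single.expand(batch_size, -1, -1, -1)
shares_memory = view.data_ptr() == single.data_ptr()
batch_stride = view.stride(0)

# clone() materializes an independent copy with real storage per batch entry.
owned = view.clone()

# Modifying one batch entry of the copy leaves the others, and the source, intact.
owned[0].zero_()
```

After the in-place write, `owned[1]` still matches `single[0]`, which would not be guaranteed if the write had targeted the expanded view itself.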

Affected pipelines

  1. StableDiffusionGLIGENPipeline
  2. StableDiffusionGLIGENTextImagePipeline

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

cc: @sayakpaul , @nikhil-masterful, @tuanh123789

References

  1. StableDiffusionGLIGENPipeline: Add GLIGEN implementation #4441
  2. StableDiffusionGLIGENTextImagePipeline: Add GLIGEN Text Image implementation #4777
  3. StableDiffusionInpaintPipeline: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py

zhen-hao.chu added 3 commits September 28, 2023 05:41
…list to convert to 1-d tensor

This avoids the TypeError caused by trying to directly iterate over a 0-dimensional tensor in the denoising stage
@sayakpaul
Member

Thanks for the PR!
Let us know when it's ready for a review :)

@chuzhdontcode chuzhdontcode marked this pull request as ready for review September 28, 2023 07:25
@chuzhdontcode
Contributor Author

chuzhdontcode commented Sep 28, 2023

Hi @sayakpaul, I have linked the issue #5216 to this PR

Member

@sayakpaul sayakpaul left a comment


Thanks for fixing!

@DN6 okay for you if we added a fast test to ensure feature compatibility for this PR?

@chuzhdontcode
Contributor Author

> Thanks for fixing!
>
> @DN6 okay for you if we added a fast test to ensure feature compatibility for this PR?

No worries @sayakpaul, thanks for reviewing😀

@patrickvonplaten
Contributor

> Thanks for fixing!
>
> @DN6 okay for you if we added a fast test to ensure feature compatibility for this PR?

Yes feel free to add a fast test

@sayakpaul
Member

@rchuzh99 let's add a fast test then :-)

@chuzhdontcode
Contributor Author

> @rchuzh99 let's add a fast test then :-)

Ok sure, @sayakpaul @patrickvonplaten

stevhliu and others added 17 commits October 4, 2023 23:31
* fix ddim inverse scheduler

* update test of ddim inverse scheduler

* update test of pix2pix_zero

* update test of diffedit

* fix typo

---------

Co-authored-by: Patrick von Platen <[email protected]>
* Make BaseOutput dataclasses picklable

* make style

* Test

* Empty commit

* Simpler and safer
* move text encoder changes

* fix

* add comment.

* fix tests

* Update src/diffusers/utils/peft_utils.py

---------

Co-authored-by: Patrick von Platen <[email protected]>
…uggingface#5237)

Ignore PyTorch, ONNX files when they coexist with Flax weights
…face#5240)

* [PEFT warnings] Only sure deprecation warnings in the future

* make style
* added docstrings in forward methods of T2IAdapter model and FullAdapter model

* added docstrings in forward methods of FullAdapterXL and AdapterBlock models

* Added docstrings in forward methods of adapter models
* Add VAE slicing and tiling methods.

* Switch to using VaeImageProcessing for preprocessing and postprocessing of images.

* Rename the VaeImageProcessor to vae_image_processor to avoid a name clash with the CLIPImageProcessor (image_processor).

* Remove the postprocess() function because we're using a VaeImageProcessor instead.

* Remove UniDiffuserPipeline.decode_image_latents because we're using VaeImageProcessor instead.

* Refactor generating text from text latents into a decode_text_latents method.

* Add enable_full_determinism() to UniDiffuser tests.

* make style

* Add PipelineLatentTesterMixin to UniDiffuserPipelineFastTests.

* Remove enable_model_cpu_offload since it is now part of DiffusionPipeline.

* Rename the VaeImageProcessor instance to self.image_processor for consistency with other pipelines and rename the CLIPImageProcessor instance to clip_image_processor to avoid a name clash.

* Update UniDiffuser conversion script.

* Make safe_serialization configurable in UniDiffuser conversion script.

* Rename image_processor to clip_image_processor in UniDiffuser tests.

* Add PipelineKarrasSchedulerTesterMixin to UniDiffuserPipelineFastTests.

* Add initial test for compiling the UniDiffuser model (not tested yet).

* Update encode_prompt and _encode_prompt to match that of StableDiffusionPipeline.

* Turn off standard classifier-free guidance for now.

* make style

* make fix-copies

* apply suggestions from review

---------

Co-authored-by: Patrick von Platen <[email protected]>
* fix: how print training resume logs.

* propagate changes to text-to-image scripts.

* propagate changes to instructpix2pix.

* propagate changes to dreambooth

* propagate changes to custom diffusion and instructpix2pix

* propagate changes to kandinsky

* propagate changes to textual inv.

* debug

* fix: checkpointing.

* debug

* debug

* debug

* back to the square

* debug

* debug

* change condition order.

* debug

* debug

* debug

* debug

* revert to original

* clean

---------

Co-authored-by: Patrick von Platen <[email protected]>
* Add docstring for the AutoencoderKL's decode

huggingface#5230

* Follow the style guidelines in AutoencoderKL's decode

huggingface#5230

---------

Co-authored-by: stano <>
* Add docstring for the AutoencoderKL's encode

huggingface#5229

* Support Python 3.8 syntax in AutoencoderKL.decode type hints

Co-authored-by: Patrick von Platen <[email protected]>

* Follow the style guidelines in AutoencoderKL's encode

huggingface#5230

---------

Co-authored-by: stano <>
Co-authored-by: Patrick von Platen <[email protected]>
* Update Unipc einsum to support 1D and 3D diffusion.

* Add unittest

* Update unittest & edge case

* Fix unittest

* Fix testing_utils.py

* Fix unittest file

---------

Co-authored-by: Patrick von Platen <[email protected]>
patrickvonplaten and others added 8 commits October 4, 2023 23:31
* fix all

* make fix copies

* make fix copies
* [SDXL Flax] Add research folder

* Add co-author

Co-authored-by: Juan Acevedo <[email protected]>

---------

Co-authored-by: Juan Acevedo <[email protected]>
* pipline fetcher

* update script

* clean up

* clean up

* clean up

* new pipeline runner

* rename tests to match modules

* test actions in pr

* change runner to gpu

* clean up

* clean up

* clean up

* fix report

* fix reporting

* clean up

* show test stats in failure reports

* give names to jobs

* add lora tests

* split torch cuda tests and add compile tests

* clean up

* fix tests

* change push to run only on main

---------

Co-authored-by: Patrick von Platen <[email protected]>
* handle case when controlnet is list

* Update src/diffusers/loaders.py

* Apply suggestions from code review

* Update src/diffusers/loaders.py

* typecheck comment

---------

Co-authored-by: Patrick von Platen <[email protected]>
* Update _toctree.yml

* Add files via upload

* Update docs/source/zh/stable_diffusion.md

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: Steven Liu <[email protected]>
@chuzhdontcode
Contributor Author

chuzhdontcode commented Oct 5, 2023

Hi @sayakpaul, I am working on the fast test for this PR and would like to ask for some suggestions, if I may.
I am adding a unit test that uses the EulerAncestralDiscreteScheduler instead of the default DDIMScheduler.
In preparation, I studied the unit tests of other pipelines, for example test_stable_diffusion.py.

I was wondering whether it is better to reuse the same approach as the existing test shown below:

  1. Method 1:

    def test_stable_diffusion_k_euler_ancestral(self):
        device = "cpu"  # ensure determinism for the device-dependent torch.Generator
        components = self.get_dummy_components()
        sd_pipe = StableDiffusionPipeline(**components)
        sd_pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(sd_pipe.scheduler.config)
        sd_pipe = sd_pipe.to(device)
        sd_pipe.set_progress_bar_config(disable=None)
        inputs = self.get_dummy_inputs(device)
        output = sd_pipe(**inputs)
        image = output.images
        image_slice = image[0, -3:, -3:, -1]
        assert image.shape == (1, 64, 64, 3)
        expected_slice = np.array([0.4872, 0.5444, 0.4846, 0.5003, 0.5549, 0.4850, 0.5189, 0.4941, 0.5067])
        assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2

==> my implementation

    def test_stable_diffusion_gligen_k_euler_ancestral(self):
        device = "cpu"  # ensure determinism for the device-dependent torch.Generator
        components = self.get_dummy_components()
        sd_pipe = StableDiffusionGLIGENPipeline(**components)
        sd_pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(sd_pipe.scheduler.config)
        sd_pipe = sd_pipe.to(device)
        sd_pipe.set_progress_bar_config(disable=None)

        inputs = self.get_dummy_inputs(device)
        output = sd_pipe(**inputs)
        image = output.images
        image_slice = image[0, -3:, -3:, -1]

        assert image.shape == (1, 64, 64, 3)
        expected_slice = np.array([0.425, 0.494, 0.429, 0.469, 0.525, 0.417, 0.533, 0.5, 0.47])

        assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
  2. Method 2:
    This method parameterizes the existing test_gligen test so that it runs with both schedulers

==> my implementation

    def get_dummy_components(self, scheduler_cls=DDIMScheduler, scheduler_kwargs: Dict = {}):
        torch.manual_seed(0)
        unet = UNet2DConditionModel(
            block_out_channels=(32, 64),
            layers_per_block=2,
            sample_size=32,
            in_channels=4,
            out_channels=4,
            down_block_types=("DownBlock2D", "CrossAttnDownBlock2D"),
            up_block_types=("CrossAttnUpBlock2D", "UpBlock2D"),
            cross_attention_dim=32,
            attention_type="gated",
        )
        # unet.position_net = PositionNet(32,32)
        scheduler = scheduler_cls(
            beta_start=0.00085,
            beta_end=0.012,
            beta_schedule="scaled_linear",
            **scheduler_kwargs,
        )
...
    @parameterized.expand(
        [
            (
                "DDIMScheduler",
                DDIMScheduler,
                {
                    "clip_sample": False,
                    "set_alpha_to_one": False,
                },
                [0.5069, 0.5561, 0.4577, 0.4792, 0.5203, 0.4089, 0.5039, 0.4919, 0.4499],
            ),
            (
                "EulerAncestralDiscreteScheduler",
                EulerAncestralDiscreteScheduler,
                {},
                [0.425, 0.494, 0.429, 0.469, 0.525, 0.417, 0.533, 0.5, 0.47],
            ),
        ]
    )
    def test_gligen(self, name, scheduler_cls, scheduler_kwargs: Dict, expected_slice: List):
        device = "cpu"  # ensure determinism for the device-dependent torch.Generator
        components = self.get_dummy_components(scheduler_cls=scheduler_cls, scheduler_kwargs=scheduler_kwargs)
        sd_pipe = StableDiffusionGLIGENPipeline(**components)
        sd_pipe = sd_pipe.to(device)
        sd_pipe.set_progress_bar_config(disable=None)

        inputs = self.get_dummy_inputs(device)
        image = sd_pipe(**inputs).images
        image_slice = image[0, -3:, -3:, -1]

        assert image.shape == (1, 64, 64, 3)

        expected_slice = np.array(expected_slice)

        assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
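The idea behind Method 2, one test body driven by a table of (name, scheduler class, kwargs, expected value) rows, can be sketched with the standard library alone. This is a minimal sketch, not the diffusers test: it swaps `parameterized.expand` for `unittest.subTest`, and `FakeDDIM`, `FakeEulerAncestral`, `CASES`, and `run_cases` are hypothetical stand-ins invented for illustration.

```python
import unittest


# Hypothetical stand-ins for scheduler classes; not the diffusers API.
class FakeDDIM:
    def __init__(self, clip_sample=True):
        self.clip_sample = clip_sample

    def describe(self):
        return f"ddim(clip_sample={self.clip_sample})"


class FakeEulerAncestral:
    def describe(self):
        return "euler_ancestral"


# One row per case: (name, scheduler class, constructor kwargs, expected value).
CASES = [
    ("ddim", FakeDDIM, {"clip_sample": False}, "ddim(clip_sample=False)"),
    ("euler_ancestral", FakeEulerAncestral, {}, "euler_ancestral"),
]


class SchedulerMatrixTest(unittest.TestCase):
    def test_all_schedulers(self):
        for name, scheduler_cls, kwargs, expected in CASES:
            # subTest reports each failing case separately, labelled by name.
            with self.subTest(scheduler=name):
                scheduler = scheduler_cls(**kwargs)
                self.assertEqual(scheduler.describe(), expected)


def run_cases():
    # Load and run the test case programmatically, returning the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SchedulerMatrixTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The trade-off between the two methods is the same as here: a table keeps the cases in one place and avoids duplicating the test body, while a separate test per scheduler (Method 1) keeps each failure and each expected slice isolated in its own named test.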

@chuzhdontcode
Contributor Author

Moreover, @sayakpaul, I accidentally performed a pull --rebase instead of a merge 🙏🏻. Right now the PR is populated with earlier commits from the fork's main branch. I am thinking of opening a new PR from a separate branch.

@sayakpaul
Member

@rchuzh99 yeah seems like the PR is borked. It would make sense to open a separate one, instead.

@chuzhdontcode
Contributor Author

@sayakpaul , okok noted 👍🏻

@chuzhdontcode
Contributor Author

Fix commit history with new PR

@chuzhdontcode
Contributor Author

Reopened at #5305


Development

Successfully merging this pull request may close these issues.

🐛[core/GLIGEN]: TypeError when iterating over 0-d tensor with In-painting mode when EulerAncestralDiscreteScheduler is used