Fix [core/GLIGEN]: TypeError when iterating over 0-d tensor with In-painting mode when EulerAncestralDiscreteScheduler is used
#5214
…list to convert to 1-d tensor This avoids the TypeError caused by trying to directly iterate over a 0-dimensional tensor in the denoising stage
…rchuzh99/fix-gligen-add-noise-timestep
|
Thanks for the PR! |
|
Hi @sayakpaul, I have linked the issue #5216 to this PR |
sayakpaul
left a comment
Thanks for fixing!
@DN6 okay for you if we added a fast test to ensure feature compatibility for this PR?
No worries @sayakpaul, thanks for reviewing😀 |
Yes feel free to add a fast test |
|
@rchuzh99 let's add a fast test then :-) |
Ok sure, @sayakpaul @patrickvonplaten |
* fix * feedback
* fix ddim inverse scheduler * update test of ddim inverse scheduler * update test of pix2pix_zero * update test of diffedit * fix typo --------- Co-authored-by: Patrick von Platen <[email protected]>
* Make BaseOutput dataclasses picklable * make style * Test * Empty commit * Simpler and safer
* move text encoder changes * fix * add comment. * fix tests * Update src/diffusers/utils/peft_utils.py --------- Co-authored-by: Patrick von Platen <[email protected]>
Fix indent issue
…uggingface#5237) Ignore PyTorch, ONNX files when they coexist with Flax weights
compile test fixes
…face#5240) * [PEFT warnings] Only sure deprecation warnings in the future * make style
* added docstrings in forward methods of T2IAdapter model and FullAdapter model * added docstrings in forward methods of FullAdapterXL and AdapterBlock models * Added docstrings in forward methods of adapter models
* Add VAE slicing and tiling methods. * Switch to using VaeImageProcessing for preprocessing and postprocessing of images. * Rename the VaeImageProcessor to vae_image_processor to avoid a name clash with the CLIPImageProcessor (image_processor). * Remove the postprocess() function because we're using a VaeImageProcessor instead. * Remove UniDiffuserPipeline.decode_image_latents because we're using VaeImageProcessor instead. * Refactor generating text from text latents into a decode_text_latents method. * Add enable_full_determinism() to UniDiffuser tests. * make style * Add PipelineLatentTesterMixin to UniDiffuserPipelineFastTests. * Remove enable_model_cpu_offload since it is now part of DiffusionPipeline. * Rename the VaeImageProcessor instance to self.image_processor for consistency with other pipelines and rename the CLIPImageProcessor instance to clip_image_processor to avoid a name clash. * Update UniDiffuser conversion script. * Make safe_serialization configurable in UniDiffuser conversion script. * Rename image_processor to clip_image_processor in UniDiffuser tests. * Add PipelineKarrasSchedulerTesterMixin to UniDiffuserPipelineFastTests. * Add initial test for compiling the UniDiffuser model (not tested yet). * Update encode_prompt and _encode_prompt to match that of StableDiffusionPipeline. * Turn off standard classifier-free guidance for now. * make style * make fix-copies * apply suggestions from review --------- Co-authored-by: Patrick von Platen <[email protected]>
* fix: how print training resume logs. * propagate changes to text-to-image scripts. * propagate changes to instructpix2pix. * propagate changes to dreambooth * propagate changes to custom diffusion and instructpix2pix * propagate changes to kandinsky * propagate changes to textual inv. * debug * fix: checkpointing. * debug * debug * debug * back to the square * debug * debug * change condition order. * debug * debug * debug * debug * revert to original * clean --------- Co-authored-by: Patrick von Platen <[email protected]>
* Add docstring for the AutoencoderKL's decode huggingface#5230 * Follow the style guidelines in AutoencoderKL's decode huggingface#5230 --------- Co-authored-by: stano <>
* Add docstring for the AutoencoderKL's encode huggingface#5229 * Support Python 3.8 syntax in AutoencoderKL.decode type hints Co-authored-by: Patrick von Platen <[email protected]> * Follow the style guidelines in AutoencoderKL's encode huggingface#5230 --------- Co-authored-by: stano <> Co-authored-by: Patrick von Platen <[email protected]>
* Update Unipc einsum to support 1D and 3D diffusion. * Add unittest * Update unittest & edge case * Fix unittest * Fix testing_utils.py * Fix unittest file --------- Co-authored-by: Patrick von Platen <[email protected]>
* fix all * make fix copies * make fix copies
* [SDXL Flax] Add research folder * Add co-author Co-authored-by: Juan Acevedo <[email protected]> --------- Co-authored-by: Juan Acevedo <[email protected]>
* pipline fetcher * update script * clean up * clean up * clean up * new pipeline runner * rename tests to match modules * test actions in pr * change runner to gpu * clean up * clean up * clean up * fix report * fix reporting * clean up * show test stats in failure reports * give names to jobs * add lora tests * split torch cuda tests and add compile tests * clean up * fix tests * change push to run only on main --------- Co-authored-by: Patrick von Platen <[email protected]>
* handle case when controlnet is list * Update src/diffusers/loaders.py * Apply suggestions from code review * Update src/diffusers/loaders.py * typecheck comment --------- Co-authored-by: Patrick von Platen <[email protected]>
* Update _toctree.yml * Add files via upload * Update docs/source/zh/stable_diffusion.md Co-authored-by: Steven Liu <[email protected]> --------- Co-authored-by: Steven Liu <[email protected]>
|
Hi @sayakpaul, I am working on the fast test for this PR and would like to ask for some suggestions, if I may. I was wondering whether it is better to reuse the same method as shown below:
diffusers/tests/pipelines/stable_diffusion/test_stable_diffusion.py Lines 382 to 399 in e46ec5f
==> my implementation
def test_stable_diffusion_gligen_k_euler_ancestral(self):
    device = "cpu"  # ensure determinism for the device-dependent torch.Generator
    components = self.get_dummy_components()
    sd_pipe = StableDiffusionGLIGENPipeline(**components)
    sd_pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(sd_pipe.scheduler.config)
    sd_pipe = sd_pipe.to(device)
    sd_pipe.set_progress_bar_config(disable=None)

    inputs = self.get_dummy_inputs(device)
    output = sd_pipe(**inputs)
    image = output.images
    image_slice = image[0, -3:, -3:, -1]

    assert image.shape == (1, 64, 64, 3)
    expected_slice = np.array([0.425, 0.494, 0.429, 0.469, 0.525, 0.417, 0.533, 0.5, 0.47])
    assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2

==> my implementation
def get_dummy_components(self, scheduler_cls=DDIMScheduler, scheduler_kwargs: Dict = {}):
    torch.manual_seed(0)
    unet = UNet2DConditionModel(
        block_out_channels=(32, 64),
        layers_per_block=2,
        sample_size=32,
        in_channels=4,
        out_channels=4,
        down_block_types=("DownBlock2D", "CrossAttnDownBlock2D"),
        up_block_types=("CrossAttnUpBlock2D", "UpBlock2D"),
        cross_attention_dim=32,
        attention_type="gated",
    )
    # unet.position_net = PositionNet(32,32)
    scheduler = scheduler_cls(
        beta_start=0.00085,
        beta_end=0.012,
        beta_schedule="scaled_linear",
        **scheduler_kwargs,
    )
    ...

@parameterized.expand(
    [
        (
            "DDIMScheduler",
            DDIMScheduler,
            {
                "clip_sample": False,
                "set_alpha_to_one": False,
            },
            [0.5069, 0.5561, 0.4577, 0.4792, 0.5203, 0.4089, 0.5039, 0.4919, 0.4499],
        ),
        (
            "EulerAncestralDiscreteScheduler",
            EulerAncestralDiscreteScheduler,
            {},
            [0.425, 0.494, 0.429, 0.469, 0.525, 0.417, 0.533, 0.5, 0.47],
        ),
    ]
)
def test_gligen(self, name, scheduler_cls, scheduler_kwargs: Dict, expected_slice: List):
    device = "cpu"  # ensure determinism for the device-dependent torch.Generator
    components = self.get_dummy_components(scheduler_cls=scheduler_cls, scheduler_kwargs=scheduler_kwargs)
    sd_pipe = StableDiffusionGLIGENPipeline(**components)
    sd_pipe = sd_pipe.to(device)
    sd_pipe.set_progress_bar_config(disable=None)

    inputs = self.get_dummy_inputs(device)
    image = sd_pipe(**inputs).images
    image_slice = image[0, -3:, -3:, -1]

    assert image.shape == (1, 64, 64, 3)
    expected_slice = np.array(expected_slice)
    assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
|
Moreover, @sayakpaul, I have accidentally performed a |
|
@rchuzh99 yeah seems like the PR is borked. It would make sense to open a separate one, instead. |
|
@sayakpaul , okok noted 👍🏻 |
|
Fix commit history with new PR |
|
Reopened at #5305 |
What does this PR do?
Fixes #5216

This PR fixes the TypeError caused by trying to directly iterate over a 0-dimensional tensor in the denoising stage of the GLIGEN in-painting operation.

The error occurs during in-painting with the StableDiffusionGLIGENPipeline and StableDiffusionGLIGENTextImagePipeline when using diffusion noise schedulers that iterate over timesteps (e.g. EulerAncestralDiscreteScheduler, KDPM2AncestralDiscreteScheduler).

For further clarification, this operation of the add_noise function 🔽
diffusers/src/diffusers/schedulers/scheduling_euler_ancestral_discrete.py
Lines 387 to 388 in ae2fc01
requires timesteps to be a non-0-dimensional torch Tensor. However, in the affected pipelines, timesteps is 0-dimensional.

This PR references the approach found in the StableDiffusionInpaintPipeline
diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py
Lines 1029 to 1034 in ae2fc01
and wraps the timestep (t) 0-d tensor in a list to convert it to a 1-d tensor, as follows 🔽
https://github.com/rchuzh99/diffusers/blob/fb82fc4bdcead457e24a780cfb193070227f3e31/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_gligen.py#L799-L806 and https://github.com/rchuzh99/diffusers/blob/fb82fc4bdcead457e24a780cfb193070227f3e31/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_gligen_text_image.py#L960-L967
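For readers without the linked code at hand, the failure and the fix can be reproduced outside the pipeline. The following is a minimal sketch, not the actual scheduler or pipeline code: `schedule_timesteps` and `lookup_step_indices` are hypothetical stand-ins for the per-timestep lookup that add_noise performs, and the only assumption is that add_noise iterates over its `timesteps` argument, as described above.

```python
import torch

# Hypothetical stand-in for the scheduler's table of timesteps.
schedule_timesteps = torch.tensor([999.0, 666.0, 333.0, 0.0])


def lookup_step_indices(timesteps: torch.Tensor) -> list:
    """Simplified, hypothetical version of the lookup done inside add_noise."""
    # Iterating over `timesteps` is the step that fails for a 0-d tensor.
    return [(schedule_timesteps == t).nonzero().item() for t in timesteps]


# Indexing (or looping over) a 1-d timestep tensor yields 0-d tensors,
# which is what the denoising loop passes around as `t`.
t = schedule_timesteps[2]
assert t.dim() == 0

try:
    lookup_step_indices(t)
    raised = False
except TypeError:
    # A 0-d tensor cannot be iterated over.
    raised = True

# Wrapping the 0-d tensor in a list and rebuilding a tensor gives shape (1,),
# which can be iterated over like any other 1-d tensor.
wrapped = torch.tensor([t])
indices = lookup_step_indices(wrapped)

print(raised, tuple(wrapped.shape), indices)
```

Under these assumptions, the 0-d call raises the TypeError while the wrapped call returns a single step index, which mirrors the change made to the two GLIGEN pipelines.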
Affected pipelines
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
cc: @sayakpaul , @nikhil-masterful, @tuanh123789