[Anima] Add img2img pipeline blocks #13929
Conversation
Thanks @PreethamNoelP, can we start by updating the PR with a code snippet to test it and the output images to show that it works, before doing the review?

Hi @asomoza, here are the code snippet and output images. Usage:
# Copied from diffusers.modular_pipelines.anima.before_denoise.AnimaSetTimestepsStep
class AnimaImg2ImgSetTimestepsStep(ModularPipelineBlocks):
You used `# Copied from` here, but you changed the function, no? You can catch these errors and code-quality issues by following the contributing doc.
Removed the # Copied from tag above AnimaImg2ImgSetTimestepsStep.
Since the class intentionally skips the set_begin_index(0) call (that's handled later by the VAE encoder step), it was never truly identical to the source, so the tag shouldn't have been there.
Please let me know if any further changes are required.
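For readers following the `begin_index` discussion, here is a minimal sketch of strength-based schedule slicing in plain Python. It mirrors the `get_timesteps()` pattern commonly used by img2img pipelines in diffusers, but it is not the code from this PR: the function name `slice_timesteps_by_strength`, the list-based schedule, and the `order` parameter are assumptions made for illustration.

```python
def slice_timesteps_by_strength(timesteps, strength, order=1):
    """Return (remaining_timesteps, begin_index) for an img2img run.

    Hypothetical helper: `timesteps` is a plain list standing in for a
    scheduler's full schedule, and `order` stands in for the scheduler order.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")

    num_steps = len(timesteps) // order
    # Number of denoising steps that will actually run.
    init_timestep = min(num_steps * strength, num_steps)
    # Index of the first step to run; earlier steps are skipped.
    t_start = int(max(num_steps - init_timestep, 0))
    begin_index = t_start * order
    return timesteps[begin_index:], begin_index


schedule = [1000, 900, 800, 700, 600, 500, 400, 300, 200, 100]
remaining, begin_index = slice_timesteps_by_strength(schedule, strength=0.5)
print(remaining)    # [500, 400, 300, 200, 100]
print(begin_index)  # 5
```

With `strength=0.5` and a ten-step schedule, the first five steps are skipped, so the begin index in this sketch is 5 rather than 0. That is presumably why a set-timesteps step that pins the index to 0 cannot be reused unchanged for img2img.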
# auto_docstring
class AnimaImg2ImgAutoBlocks(SequentialPipelineBlocks):
We do not need an AnimaImg2ImgAutoBlocks. We add a workflow to AnimaAutoBlocks instead.
Removed AnimaImg2ImgAutoBlocks entirely and folded img2img into AnimaAutoBlocks following the z_image pattern.
I added AnimaAutoDenoiseStep (an AutoPipelineBlocks that picks between img2img and txt2img based on whether image is provided) and AnimaImg2ImgDenoiseStep (a SequentialPipelineBlocks that wraps the img2img-specific set_timesteps + denoise steps).
AnimaAutoBlocks now has a _workflow_map covering both workflows, so users only need one pipeline - passing image= automatically triggers img2img.
Please let me know if any further changes are required.
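The selection behaviour described above can be sketched without diffusers. This is a simplified stand-in for how an auto block might choose a sub-block from trigger inputs, not the `AutoPipelineBlocks` implementation; `select_block`, the block names, and the dictionary-based state are hypothetical.

```python
def select_block(block_names, trigger_inputs, state):
    """Pick the first block whose trigger input is present in `state`.

    A trigger of None marks the default block, used when no earlier
    trigger matched. Hypothetical helper for illustration only.
    """
    for name, trigger in zip(block_names, trigger_inputs):
        if trigger is None:
            return name  # default workflow
        if state.get(trigger) is not None:
            return name  # trigger input was provided
    return None  # nothing matched and no default was declared


names = ["img2img", "text2img"]
triggers = ["image", None]

with_image = select_block(names, triggers, {"prompt": "a cat", "image": "<PIL image>"})
without_image = select_block(names, triggers, {"prompt": "a cat"})
print(with_image)     # img2img
print(without_image)  # text2img
```

In this sketch the order of the lists matters: the img2img entry has to come before the default, otherwise the default would always win.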
…nto AnimaAutoBlocks
- Remove incorrect `# Copied from` comment above AnimaImg2ImgSetTimestepsStep
- Delete AnimaImg2ImgAutoBlocks; introduce AnimaAutoDenoiseStep (AutoPipelineBlocks) and AnimaImg2ImgDenoiseStep (SequentialPipelineBlocks) so img2img lives as a workflow inside AnimaAutoBlocks, following the z_image pattern
- Update __init__.py, dummy_objects, and docs to remove AnimaImg2ImgAutoBlocks
- Update img2img test to use AnimaAutoBlocks with updated workflow block paths

What does this PR do?
This PR adds image-to-image (`img2img`) support to the Anima modular pipeline, as requested in #13903. It introduces three new blocks following the same conventions as the existing `AnimaAutoBlocks` text-to-image pipeline: the timestep schedule is computed at the top level, the input image is encoded and noised inside the denoise sequence, and the final assembly is exposed as `AnimaImg2ImgAutoBlocks`.
New blocks added:
- `AnimaImg2ImgSetTimestepsStep`: computes the full timestep schedule without resetting `begin_index` to 0, leaving the strength-based offset to be applied by the VAE encoder step downstream
- `AnimaImg2ImgVaeEncoderStep`: encodes the input image with the VAE, slices the timestep schedule by `strength` via `get_timesteps()`, and adds noise using `scheduler.scale_noise()`
- `AnimaImg2ImgAutoBlocks`: top-level `SequentialPipelineBlocks` assembly for the `img2img` workflow, requiring `prompt` and `image`
Architecture note
`AnimaImg2ImgVaeEncoderStep` is placed inside `AnimaImg2ImgCoreDenoiseStep`, after `AnimaTextInputStep`, rather than at the top level of `AnimaImg2ImgAutoBlocks`. `AnimaTextInputStep` writes `batch_size` (the number of prompts) to pipeline state, and the VAE encoder needs this value to expand image latents to `batch_size * num_images_per_prompt` before adding noise. Placing the encoder at the top level causes a tensor shape mismatch: latents are shaped for `num_images_per_prompt` only, while prompt embeddings are already expanded to `batch_size * num_images_per_prompt`.
Test results
Fixes #13903
Before submitting
Who can review?
@asomoza @yiyixuxu