[Anima] Add img2img pipeline blocks#13929

Open
PreethamNoelP wants to merge 3 commits into
huggingface:mainfrom
PreethamNoelP:anima-img2img

Conversation

@PreethamNoelP

What does this PR do?

This PR adds image-to-image (img2img) support to the Anima modular pipeline, as requested in #13903. It introduces three new blocks following the same conventions as the existing AnimaAutoBlocks text-to-image pipeline: the timestep schedule is computed at the top level, the input image is encoded and noised inside the denoise sequence, and the final assembly is exposed as AnimaImg2ImgAutoBlocks.

New blocks added:

  • AnimaImg2ImgSetTimestepsStep — computes the full timestep schedule without resetting begin_index to 0, leaving the strength-based offset to be applied by the VAE encoder step downstream
  • AnimaImg2ImgVaeEncoderStep — encodes the input image with the VAE, slices the timestep schedule by strength via get_timesteps(), and adds noise using scheduler.scale_noise()
  • AnimaImg2ImgAutoBlocks — top-level SequentialPipelineBlocks assembly for the img2img workflow, requiring prompt and image
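The strength-based slicing and noising described in the list above can be sketched in plain Python. This is a minimal illustration, assuming the slicing rule used by existing diffusers img2img pipelines and a flow-matching interpolation for `scale_noise()`; the function bodies, the toy schedule, and the list-based "latents" are assumptions for illustration, not the code in this PR.

```python
def get_timesteps(timesteps, num_inference_steps, strength, order=1):
    """Keep only the tail of the schedule that a given strength should run."""
    # Number of denoising steps that will actually be executed.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    # Index of the first step to run; earlier (noisier) steps are skipped.
    t_start = max(num_inference_steps - init_timestep, 0)
    return timesteps[t_start * order:], num_inference_steps - t_start


def scale_noise(sample, noise, sigma):
    """Flow-matching style interpolation between clean latents and noise."""
    return [sigma * n + (1.0 - sigma) * s for s, n in zip(sample, noise)]


# Toy schedule: 28 descending sigmas starting at 1.0 (illustrative values only).
num_steps = 28
sigmas = [1.0 - i / num_steps for i in range(num_steps)]

# strength=0.75 keeps the last 21 of 28 steps.
kept, steps_to_run = get_timesteps(sigmas, num_steps, strength=0.75)

# Noise the (toy) image latents at the first kept sigma.
noised = scale_noise(sample=[1.0, 2.0], noise=[0.0, 0.0], sigma=kept[0])
```

With `strength=1.0` the whole schedule is kept and the input image is fully replaced by noise; lower values skip more of the early steps and preserve more of the input.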

Architecture note

AnimaImg2ImgVaeEncoderStep is placed inside AnimaImg2ImgCoreDenoiseStep, after AnimaTextInputStep, rather than at the top level of AnimaImg2ImgAutoBlocks. AnimaTextInputStep writes batch_size (the number of prompts) to pipeline state, and the VAE encoder needs this value to expand image latents to batch_size * num_images_per_prompt before adding noise. Placing the encoder at the top level causes a tensor shape mismatch — latents are shaped for num_images_per_prompt only, while prompt embeddings are already expanded to batch_size * num_images_per_prompt.
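The shape requirement behind this placement can be illustrated with a small sketch. The helper name `expand_image_latents` and the list-of-strings stand-in for a latent tensor are hypothetical; the point is only that the expansion needs `batch_size`, which is not known until the text input step has run.

```python
def expand_image_latents(image_latents, batch_size, num_images_per_prompt):
    """Repeat encoded image latents so their batch dimension matches the prompts."""
    target = batch_size * num_images_per_prompt
    if target % len(image_latents) != 0:
        raise ValueError(
            f"Cannot expand {len(image_latents)} image latents to a batch of {target}."
        )
    # Tile the latents until the batch dimension equals the target size.
    return image_latents * (target // len(image_latents))


# One input image, two prompts, three images per prompt.
image_latents = ["latent_0"]
expanded = expand_image_latents(image_latents, batch_size=2, num_images_per_prompt=3)

# Prompt embeddings are already expanded to batch_size * num_images_per_prompt rows.
prompt_embed_rows = 2 * 3
```

Without `batch_size`, the encoder could only expand to `num_images_per_prompt` entries (3 here), which would not line up with the 6 prompt embedding rows.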

Test results

pytest tests/modular_pipelines/anima/test_modular_pipeline_anima_img2img.py -q
13 passed, 5 skipped

Fixes #13903

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum? Link: [Anima] Add img2img capability #13903
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@asomoza @yiyixuxu

github-actions bot added the documentation, tests, modular-pipelines, utils, fixes-issue, and size/L labels on Jun 12, 2026
@asomoza

asomoza commented Jun 15, 2026

Member

Thanks @PreethamNoelP. Before we do the review, can we start by updating the PR with a code snippet to test it and some output images showing that it works?

@PreethamNoelP

Author

Hi @asomoza, here is the code snippet and output images.

Usage:

import torch
from diffusers import AnimaModularPipeline
from diffusers.modular_pipelines.anima import AnimaImg2ImgAutoBlocks
from diffusers.utils import load_image

pipe = AnimaModularPipeline(
    blocks=AnimaImg2ImgAutoBlocks(),
    pretrained_model_name_or_path="circlestone-labs/Anima-Base-v1.0-Diffusers",
)
pipe.load_components(torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Hypothetical path: the original snippet used `input_image` without defining it.
input_image = load_image("input.png")

image = pipe(
    prompt="masterpiece, best quality, mountain landscape at golden hour, snow-capped peaks, pine forest, dramatic clouds, cinematic, detailed",
    image=input_image,
    strength=0.75,
    num_inference_steps=28,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]

Outputs:
[image: side_by_side_comparison]

Comment on lines +432 to +433
# Copied from diffusers.modular_pipelines.anima.before_denoise.AnimaSetTimestepsStep
class AnimaImg2ImgSetTimestepsStep(ModularPipelineBlocks):

@asomoza asomoza Jun 17, 2026

Member

You used `# Copied from` here, but you changed the function, no? You can catch these errors and code quality issues by following the contributing doc.

@PreethamNoelP PreethamNoelP Jun 19, 2026

Author

Removed the # Copied from tag above AnimaImg2ImgSetTimestepsStep.
Since the class intentionally skips the set_begin_index(0) call (that's handled later by the VAE encoder step), it was never truly identical to the source, so the tag shouldn't have been there.
Please let me know if any further changes are required.



# auto_docstring
class AnimaImg2ImgAutoBlocks(SequentialPipelineBlocks):

Collaborator

We do not need an AnimaImg2ImgAutoBlocks. Instead, we add a workflow to AnimaAutoBlocks.

see example https://github.com/huggingface/diffusers/blob/main/src/diffusers/modular_pipelines/z_image/modular_blocks_z_image.py#L325

Author

Removed AnimaImg2ImgAutoBlocks entirely and folded img2img into AnimaAutoBlocks following the z_image pattern.
I added AnimaAutoDenoiseStep (an AutoPipelineBlocks that picks between img2img and txt2img based on whether image is provided) and AnimaImg2ImgDenoiseStep (a SequentialPipelineBlocks that wraps the img2img-specific set_timesteps + denoise steps).
AnimaAutoBlocks now has a _workflow_map covering both workflows, so users only need one pipeline - passing image= automatically triggers img2img.
Please let me know if any further changes are required.
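The input-triggered selection described above can be pictured with a toy dispatcher. This is a conceptual sketch only: `select_workflow` and `TRIGGER_MAP` are hypothetical names, and the real AutoPipelineBlocks and `_workflow_map` machinery in diffusers is not reproduced here.

```python
def select_workflow(inputs, trigger_map):
    """Return the first workflow whose trigger input was provided.

    A trigger of None marks the default workflow, used when no
    earlier trigger matched.
    """
    for trigger, workflow in trigger_map:
        if trigger is None or inputs.get(trigger) is not None:
            return workflow
    raise ValueError("No workflow matched the provided inputs.")


# Ordered from most specific to the default; names are illustrative.
TRIGGER_MAP = [
    ("image", "img2img"),
    (None, "text2image"),
]

txt2img_choice = select_workflow({"prompt": "a mountain"}, TRIGGER_MAP)
img2img_choice = select_workflow({"prompt": "a mountain", "image": object()}, TRIGGER_MAP)
```

Ordering the map from specific to default means that passing `image=None` explicitly still falls through to text-to-image.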

…nto AnimaAutoBlocks

- Remove incorrect `# Copied from` comment above AnimaImg2ImgSetTimestepsStep
- Delete AnimaImg2ImgAutoBlocks; introduce AnimaAutoDenoiseStep (AutoPipelineBlocks)
  and AnimaImg2ImgDenoiseStep (SequentialPipelineBlocks) so img2img lives as a
  workflow inside AnimaAutoBlocks, following the z_image pattern
- Update __init__.py, dummy_objects, and docs to remove AnimaImg2ImgAutoBlocks
- Update img2img test to use AnimaAutoBlocks with updated workflow block paths
github-actions bot removed the documentation and utils labels on Jun 19, 2026
Development

Successfully merging this pull request may close these issues.

[Anima] Add img2img capability

3 participants