Easy Reinforcement Learning for Diffusion and Flow-Matching Models
- [2026-02-01] Support for multiple attention backends! You can now optimize memory and speed by setting the `attn_backend` parameter in your config:

  ```yaml
  model:
    attn_backend: "flash"  # Options: "native", "xformers", "flash_hub", "_flash_3_hub", "_flash_3_varlen_hub"
  ```

  This experimental feature leverages diffusers's `transformer.set_attention_backend`. Check the official diffusers documentation for all available options. We recommend installing the `kernels` package (`pip install kernels`) and using `flash_hub`, `flash_varlen_hub`, `_flash_3_hub`, or `_flash_3_varlen_hub` to avoid the complexity and potential incompatibility of installing Flash-Attention directly.
- [2026-01-17] We have added the latest FLUX.2-Klein series! Follow the commands to start:

  ```shell
  # Clone the repo with submodule `diffusers`
  git clone --recursive https://github.com/X-GenGroup/Flow-Factory.git
  cd Flow-Factory

  # Fetch the source code of `diffusers==0.37.0.dev`
  git submodule update --init --recursive

  # Install `diffusers==0.37.0.dev`
  cd diffusers
  pip install -e .

  # Install Flow-Factory
  cd ..
  pip install -e .
  ```

| Task | Model | Model Size | Model Type |
|---|---|---|---|
| Text-to-Image | FLUX.1-dev | 13B | flux1 |
| | Z-Image-Turbo | 12B | z-image |
| | Qwen-Image | 20B | qwen-image |
| | Qwen-Image-2512 | 20B | qwen-image |
| Image-to-Image | FLUX.1-Kontext-dev | 13B | flux1-kontext |
| Image(s)-to-Image | Qwen-Image-Edit-2509 | 20B | qwen-image-edit-plus |
| | Qwen-Image-Edit-2511 | 20B | qwen-image-edit-plus |
| Text-to-Image & Image(s)-to-Image | FLUX.2-dev | 30B | flux2 |
| | FLUX.2-klein-4B | 4B | flux2-klein |
| | FLUX.2-klein-9B | 9B | flux2-klein |
| | FLUX.2-klein-base-4B | 4B | flux2-klein |
| | FLUX.2-klein-base-9B | 9B | flux2-klein |
| Text-to-Video | Wan2.1-T2V-1.3B | 1.3B | wan2_t2v |
| | Wan2.1-T2V-14B | 14B | wan2_t2v |
| | Wan2.2-TI2V-5B | 5B | wan2_t2v |
| | Wan2.2-T2V-A14B | A14B | wan2_t2v |
| Image-to-Video | Wan2.1-I2V-14B-480P | 14B | wan2_i2v |
| | Wan2.1-I2V-14B-720P | 14B | wan2_i2v |
| | Wan2.2-TI2V-5B | 5B | wan2_i2v |
| | Wan2.2-I2V-A14B | A14B | wan2_i2v |
To support new models, see Guidance/New Model.
| Algorithm | trainer_type |
|---|---|
| GRPO | grpo |
| GRPO-Guard | grpo-guard |
| DiffusionNFT | nft |
| AWM | awm |
See Algorithm Guidance for more information.
Model and algorithm are fully decoupled in Flow-Factory, enabling all listed model × algorithm combinations to work out of the box. The configurations under `examples/` have been verified to yield measurable performance gains. For unlisted combinations, find the closest (task, algorithm) config and swap in the desired model or algorithm parameters.
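As a hypothetical illustration of such a swap (the key layout below is an assumption based on the tables above, so treat the configs under `examples/` as the authoritative schema), switching a verified text-to-image config to a different algorithm might only require changing the trainer field:

```yaml
model:
  model_type: "flux1"        # any value from the Model Type column above
train:
  trainer_type: "grpo-guard" # any value from the trainer_type column above
```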
```shell
git clone https://github.com/Jayce-Ping/Flow-Factory.git
cd Flow-Factory
pip install -e .
```

Optional dependencies, such as deepspeed, are also available. Install them with:

```shell
pip install -e .[deepspeed]
```

To use Weights & Biases or SwanLab to log experiment results, install the extra dependencies via `pip install -e .[wandb]` or `pip install -e .[swanlab]`.
After installation, set the corresponding arguments in the config file:

```yaml
run_name: null              # Run name (auto: {model_type}_{finetune_type}_{trainer_type}_{timestamp})
project: "Flow-Factory"     # Project name for logging
logging_backend: "wandb"    # Options: wandb, swanlab, tensorboard, none
```

These trackers allow you to visualize both training samples and metric curves online.
Start training with the following simple command:
```shell
ff-train examples/grpo/lora/flux1.yaml
```

We provide a set of guidance documents to help you understand the framework and extend it. For a comprehensive understanding of the framework's design and motivation, refer to our technical report.
| Document | Description |
|---|---|
| Workflow | End-to-end training pipeline: the overall stages from data preprocessing to policy optimization |
| Algorithms | Supported RL algorithms (GRPO, GRPO-Guard, DiffusionNFT, AWM) and their configurations |
| Rewards | Reward model system: built-in models, custom rewards, and remote reward servers |
| New Model | How to add support for a new Diffusion/Flow-Matching model |
The unified dataset structure is:

```
dataset
├── train.txt / train.jsonl
├── test.txt / test.jsonl (optional)
├── images (optional)
│   ├── image1.png
│   └── ...
└── videos (optional)
    ├── video1.mp4
    └── ...
```
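As a small sketch (the helper below is our own, not part of Flow-Factory), the layout above can be checked with a few lines of standard-library Python:

```python
from pathlib import Path


def validate_dataset(root: str) -> list:
    """Check a dataset directory against the layout described above.

    Returns a list of human-readable problems (empty list = layout looks fine).
    Only train.txt / train.jsonl is strictly required; test files, images/,
    and videos/ are all optional.
    """
    base = Path(root)
    problems = []
    if not any((base / f).is_file() for f in ("train.txt", "train.jsonl")):
        problems.append("missing train.txt or train.jsonl")
    for sub in ("images", "videos"):
        p = base / sub
        if p.exists() and not p.is_dir():
            problems.append(f"{sub} exists but is not a directory")
    return problems
```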
For text-to-image and text-to-video tasks, the only required input is the prompt in plain text format. Use train.txt and test.txt (optional) with the following format:

```
A hill in a sunset.
An astronaut riding a horse on Mars.
```

Example: dataset/pickscore

Each line represents a single text prompt. Alternatively, you can use train.jsonl and test.jsonl in the following format:

```json
{"prompt": "A hill in a sunset."}
{"prompt": "An astronaut riding a horse on Mars."}
```

Example: dataset/t2is
`negative_prompt` is also supported:

```json
{"prompt": "A hill in a sunset.", "negative_prompt": "low quality, blurry, distorted, poorly drawn"}
{"prompt": "An astronaut riding a horse on Mars.", "negative_prompt": "low quality, blurry, distorted, poorly drawn"}
```

Example: dataset/t2is_neg
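For instance (a standalone sketch, not Flow-Factory code), such a .jsonl prompt file can be written and read back with nothing but the standard library:

```python
import json


def write_jsonl(path, records):
    """Write one JSON object per line, as in train.jsonl / test.jsonl."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Parse a .jsonl prompt file back into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```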
For tasks involving conditioning images, use train.jsonl and test.jsonl in the following format:

```json
{"prompt": "A hill in a sunset.", "image": "path/to/image1.png"}
{"prompt": "An astronaut riding a horse on Mars.", "image": "path/to/image2.png"}
```

Example: dataset/sharegpt4o_image_mini
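Since single-condition records use a singular key (`image`, and analogously `video`) while multi-condition records use the plural form (`images`, `videos`), a data loader typically normalizes both spellings to a list of paths. A minimal sketch (this helper is our own, not part of Flow-Factory):

```python
def condition_paths(record, key="image"):
    """Return conditioning paths from a .jsonl record as a list, whether
    the record uses the singular key (e.g. "image") or its plural form
    (e.g. "images"). Returns [] if the record has no conditioning input."""
    if key in record:
        return [record[key]]
    plural = key + "s"
    if plural in record:
        return list(record[plural])
    return []
```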
The default root directory for images is `dataset_dir/images`, and for videos it is `dataset_dir/videos`. You can override these locations by setting the `image_dir` and `video_dir` variables in the config file:

```yaml
data:
  dataset_dir: "path/to/dataset"
  image_dir: "path/to/image_dir"  # (defaults to "{dataset_dir}/images")
  video_dir: "path/to/video_dir"  # (defaults to "{dataset_dir}/videos")
```

For models like FLUX.2-dev and Qwen-Image-Edit-2511 that accept multiple images as conditions, use the `images` key with a list of image paths:

```json
{"prompt": "A hill in a sunset.", "images": ["path/to/condition_image_1_1.png", "path/to/condition_image_1_2.png"]}
{"prompt": "An astronaut riding a horse on Mars.", "images": ["path/to/condition_image_2_1.png", "path/to/condition_image_2_2.png"]}
```

Conditioning videos follow the same pattern with the `video` and `videos` keys:

```json
{"prompt": "A hill in a sunset.", "video": "path/to/video1.mp4"}
{"prompt": "An astronaut riding a horse on Mars.", "videos": ["path/to/video2.mp4", "path/to/video3.mp4"]}
```

Flow-Factory provides a flexible reward model system that supports both built-in and custom reward models for reinforcement learning.
Flow-Factory supports the following types of reward models:
- Pointwise Reward: Computes an independent score for each sample (e.g., aesthetic quality, text-image alignment).
- Pairwise Reward: Computes rewards from pairwise comparisons within a group; this is a special case of the groupwise reward below.
- Groupwise Reward: Computes rewards that require all samples in a group (e.g., ranking-based scores or pairwise comparisons).
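The distinction can be illustrated with a small sketch (the function signatures are our own simplification, not Flow-Factory's actual reward API): a pointwise reward maps each sample to a score independently, while a groupwise reward needs the whole group at once, e.g. to convert raw scores into within-group ranks:

```python
def pointwise_reward(scores):
    """Pointwise: each sample's reward depends only on that sample."""
    return list(scores)


def groupwise_rank_reward(scores):
    """Groupwise: reward each sample by its rank within the group,
    normalized to [0, 1] (best sample in the group gets 1.0)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # indices, worst first
    rewards = [0.0] * n
    for rank, i in enumerate(order):
        rewards[i] = rank / (n - 1) if n > 1 else 1.0
    return rewards
```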
The following reward models are pre-registered and ready to use:
| Name | Type | Description | Reference |
|---|---|---|---|
| PickScore | Pointwise | CLIP-based aesthetic scoring model | PickScore |
| PickScore_Rank | Groupwise | Ranking-based reward using PickScore | PickScore |
| CLIP | Pointwise | Image-text cosine similarity | CLIP |
Simply specify the reward model name in your config file:
```yaml
rewards:
  name: "aesthetic"          # Alias for this reward model
  reward_model: "PickScore"  # Reward model type or a path like 'my_package.rewards.CustomReward'
  batch_size: 16
  device: "cuda"
  dtype: bfloat16
```

Refer to the Rewards Guidance for more information about advanced usage, such as creating a custom reward model.
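Note that `reward_model` accepts either a registered name or a dotted path such as `my_package.rewards.CustomReward`. A dotted-path lookup of that kind can be sketched with `importlib` (this illustrates the general mechanism; it is not Flow-Factory's actual loader):

```python
import importlib


def resolve_reward(spec, registry):
    """Resolve `spec` to a reward object: first try the built-in registry
    (e.g. "PickScore"), then fall back to importing a dotted path such as
    "my_package.rewards.CustomReward"."""
    if spec in registry:
        return registry[spec]
    module_name, _, attr = spec.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```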
This repository is built on diffusers, accelerate, and peft. We thank them for their contributions to the community!
If you find Flow-Factory useful in your research, please consider citing our paper:
```bibtex
@article{ping2026flowfactory,
  title={Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models},
  author={Bowen Ping and Chengyou Jia and Minnan Luo and Hangwei Qian and Ivor Tsang},
  journal={arXiv preprint arXiv:2602.12529},
  year={2026},
  url={https://arxiv.org/abs/2602.12529},
}
```