CoCA

This is the official implementation of "Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning". CoCA is a contribution-based credit assignment algorithm that converts trajectory-level rewards into step-level rewards at no additional cost during RL fine-tuning of text-to-image (T2I) diffusion models.
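
The general idea can be illustrated with a minimal sketch. The function and its per-step contribution scores are illustrative placeholders, not the repository's API; how CoCA actually measures each denoising step's contribution is defined in the paper and is not reproduced here.

import numpy as np

def contribution_based_rewards(traj_reward, contributions):
    # Split one trajectory-level reward across denoising steps in
    # proportion to each step's (hypothetical) contribution score.
    w = np.asarray(contributions, dtype=np.float64)
    w = w / w.sum()          # normalize so the step rewards sum to traj_reward
    return w * traj_reward

# Example: a 4-step trajectory whose final image scored 0.8
print(contribution_based_rewards(0.8, [0.1, 0.3, 0.4, 0.2]))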

Installation

cd CoCA
conda create -n coca python=3.10
conda activate coca
pip install -e .
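
As an optional sanity check (assuming the project's PyTorch dependency is pulled in by pip install -e .), confirm that a GPU is visible:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"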

Training

We provide training scripts in the bash/ directory for fine-tuning on the Aesthetic, HPSv2, ImageReward, and PickScore rewards.

Fine-tuning SD-v1.5 on the Aesthetic score using CoCA:

bash bash/train_coca_aesthetic.sh

Fine-tuning SD-v1.5 on the HPSv2 score using CoCA:

bash bash/train_coca_hps.sh

Fine-tuning SD-v1.5 on ImageReward using CoCA:

bash bash/train_coca_ir.sh

Fine-tuning SD-v1.5 on PickScore using CoCA:

bash bash/train_coca_pick.sh

We also provide training scripts for comparison experiments with DDPO and UCA (Uniform Credit Assignment); a sketch of the UCA baseline follows the commands below.

Fine-tuning SD-v1.5 on the Aesthetic score using DDPO:

bash bash/train_ddpo_aesthetic.sh

Fine-tuning SD-v1.5 on the Aesthetic score using UCA:

bash bash/train_uca_aesthetic.sh
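
For reference, UCA spreads the trajectory-level reward evenly over all denoising steps. A minimal sketch of that baseline (the function name is illustrative, not taken from the codebase):

import numpy as np

def uniform_rewards(traj_reward, num_steps):
    # UCA baseline: every denoising step receives an equal share of the reward.
    return np.full(num_steps, traj_reward / num_steps)

print(uniform_rewards(0.8, 4))   # -> [0.2 0.2 0.2 0.2]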

Reward Curves
