jinuk0211/CSR

 
 

Step 1. Construct Preference Data.

First, place the COCO-2014 train images in './data/images/'. Then run the following steps in sequence.

cd ./CSR/inference_csr
bash ./step1.sh
bash ./step2.sh
bash ./step3.sh

You now have the preference dataset. Because this process is time-consuming, we also provide our preference datasets on Hugging Face.

Step 2. Direct Preference Optimization (DPO).

bash ./CSR/scripts/run_train.sh

Step 3. Iterative Learning.

After completing a round of CSR training, merge the current LoRA checkpoint. Then use the merged checkpoint as the base model and repeat Step 1 and Step 2 in order.

python ./scripts/merge_lora_weights.py --model-path "your LoRA checkpoint path" --model-base "your llava 1.5 checkpoint path --> your Iter-1 path --> your Iter-2 path ...." --save-model-path "xxx"
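The chaining of `--model-base` across rounds can be sketched as a loop: each round's merged model becomes the base for the next. In the sketch below, the checkpoint paths, the initial base path, and the iteration count are placeholders, not values fixed by the repository.

```shell
#!/bin/sh
# Hedged sketch of the iterative loop. After each round, the merged
# checkpoint seeds the next round. All paths and the iteration count
# are placeholders.
BASE="your llava 1.5 checkpoint path"
for i in 1 2 3; do
  # ... run Step 1 (data construction) and Step 2 (DPO) against $BASE ...
  LORA="./checkpoints/csr-iter${i}-lora"      # LoRA output of this round
  MERGED="./checkpoints/csr-iter${i}-merged"
  python ./scripts/merge_lora_weights.py \
      --model-path "$LORA" \
      --model-base "$BASE" \
      --save-model-path "$MERGED"
  BASE=$MERGED   # the Iter-i merged model is the base for round i+1
done
```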

About

LMM, ReST, inference scaling
