First, place the COCO-2014 train images in './data/images/'. Then complete the following steps in sequence.
cd ./CSR/inference_csr
bash ./step1.sh
bash ./step2.sh
bash ./step3.sh

You now have the preference dataset. This process takes a long time, so we also provide our preference datasets on Hugging Face.
bash ./CSR/scripts/run_train.sh

After completing a round of CSR training, you need to merge the current LoRA checkpoint. Use the merged checkpoint as the base model and proceed with Step 1 and Step 2 sequentially.
python ./scripts/merge_lora_weights.py --model-path "your LoRA checkpoint path" --model-base "your llava 1.5 checkpoint path --> your Iter-1 path --> your Iter-2 path ...." --save-model-path "xxx"
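
The way `--model-base` changes from one round to the next can be sketched as a dry run that only prints the merge command for each iteration. This is a minimal sketch, not part of the repository: the helper `print_merge_cmd`, every path under `./checkpoints/`, and the choice of three iterations are hypothetical placeholders. Only the script name and its three flags come from the command above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the merge command for each CSR round.
# All checkpoint paths are hypothetical placeholders; substitute your own.

# Print the merge command for one round.
print_merge_cmd() {
  local lora_ckpt="$1" base_model="$2" merged_out="$3"
  printf 'python ./scripts/merge_lora_weights.py --model-path "%s" --model-base "%s" --save-model-path "%s"\n' \
    "$lora_ckpt" "$base_model" "$merged_out"
}

base="./checkpoints/llava-1.5"   # hypothetical: the original LLaVA-1.5 checkpoint

for iter in 1 2 3; do            # three rounds is an assumption for illustration
  lora="./checkpoints/csr-lora-iter${iter}"      # hypothetical LoRA output of this round
  merged="./checkpoints/csr-merged-iter${iter}"  # hypothetical merged output
  print_merge_cmd "$lora" "$base" "$merged"
  base="$merged"   # the merged model becomes the base for the next round
done
```

In a real run, the preference-data steps and the training script would be executed between merges, with the training configuration pointed at the current base model. How that path is passed to those scripts is not shown here and depends on the repository's own configuration.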