Please follow the installation instructions to install lmms-eval.
If you have already installed lmms-eval, copy the task directory `./lmms_eval/tasks/longhallqa` to the same location in your project (`./lmms_eval/tasks/`).
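Assuming this repository is checked out next to your existing lmms-eval installation, the copy step can be done in one command (both paths below are placeholders; adjust them to your local layout):

```shell
# Copy the LongHallQA task definition into an existing lmms-eval checkout.
# "/path/to/lmms-eval" is a placeholder for your local installation.
cp -r ./lmms_eval/tasks/longhallqa /path/to/lmms-eval/lmms_eval/tasks/
```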
# Running Evaluation for a Specific MLLM and Task

```bash
python3 -m accelerate.commands.launch \
    --num_processes=1 \
    -m lmms_eval \
    --model /MODEL/NAME \
    --model_args pretrained=/PRETRAIN/CHECKPOINTS/PARAMETERS \
    --tasks longhalqa \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix /SAVE/SUFFIX \
    --output_path ./logs/
```
An example of running LLaVA-1.5-7b on the Hallucination Completion task of LongHalQA:
```bash
python3 -m accelerate.commands.launch \
    --num_processes=1 \
    -m lmms_eval \
    --model llava \
    --model_args pretrained="liuhaotian/llava-v1.5-7b" \
    --tasks longhalqa \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_v15_7b_lhqa_completion \
    --output_path ./logs/
```

The table below lists the `--model` and `--model_args` values for the evaluated MLLMs:

| MLLM | model | model_args (pretrained=) |
|---|---|---|
| MiniCPM-V-2 | minicpm_v | "openbmb/MiniCPM-V-2" |
| Qwen2-VL-2B | qwen2_vl | "Qwen/Qwen2-VL-2B-Instruct" |
| Fuyu | fuyu | "adept/fuyu-8b" |
| LLaVA-1.5-7b | llava | "liuhaotian/llava-v1.5-7b" |
| LLaVA-1.5-13b | llava | "liuhaotian/llava-v1.5-13b" |
| LLaVA-1.6-7b | llava | "liuhaotian/llava-v1.6-mistral-7b,conv_template=mistral_instruct" |
| Qwen-VL-Chat | qwen_vl_chat | "Qwen/Qwen-VL-Chat" |
| LLaVA-1.6-34b | llava | "liuhaotian/llava-v1.6-34b,conv_template=chatml_direct" |
| Qwen2-VL-72B | qwen2_vl | "Qwen/Qwen2-VL-72B-Instruct" |
The task names available for `--tasks` are:

| Hallucination Discrimination | Hallucination Completion |
|---|---|
| lhqa_discrim_object_binary | lhqa_complete_description |
| lhqa_discrim_description_binary | lhqa_complete_conversation |
| lhqa_discrim_conversation_binary | |
| lhqa_discrim_description_choice | |
| lhqa_discrim_conversation_choice | |
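To evaluate individual sub-tasks instead of the full benchmark, pass one or more of the task names above to `--tasks` (comma-separated); the model and suffix below are just illustrative placeholders:

```shell
python3 -m accelerate.commands.launch \
    --num_processes=1 \
    -m lmms_eval \
    --model llava \
    --model_args pretrained="liuhaotian/llava-v1.5-7b" \
    --tasks lhqa_discrim_object_binary,lhqa_complete_description \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_v15_7b_lhqa_subtasks \
    --output_path ./logs/
```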
We fork and modify lmms-eval to support LongHalQA. Many thanks to the authors of this excellent project.