From Charts to Code: A Hierarchical Benchmark for Multimodal Models

Welcome to Chart2Code! If you find this repo useful, please give it a star ⭐.


🌟 Overview

Chart2Code is a new benchmark designed to evaluate the chart-to-code generation capabilities of large multimodal models (LMMs) under progressively challenging conditions.


Chart2Code covers three progressively challenging levels: chart reproduction, chart editing, and long-table-to-chart generation.

  • Level 1 (Chart Reproduction): reproduce a chart from a reference figure and a user query.
  • Level 2 (Chart Editing): apply complex modifications such as changing the chart type or adding elements.
  • Level 3 (Long-Table to Chart Generation): transform long, information-dense tables into faithful charts following user instructions.
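For intuition, here is a minimal sketch of the kind of chart-generating script these tasks revolve around; the reference scripts shipped in data/ are more elaborate, and the data and file name below are purely illustrative.

# Minimal illustration of a Level 1 style target: matplotlib code that renders
# a chart and saves it to a PNG. Values and file name are illustrative only.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]
values = [3, 7, 5, 2]

fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(categories, values, color="steelblue")
ax.set_xlabel("Category")
ax.set_ylabel("Value")
ax.set_title("Example bar chart")
fig.tight_layout()
fig.savefig("example_reproduction.png", dpi=150)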

More details about Chart2Code are available on the project page. 🌐

🚀 Quick Start

Here we provide a quick start guide to evaluate LMMs on Chart2Code.

Setup Environment

git clone https://github.com/showlab/Chart2Code.git
cd Chart2Code
conda env create -f environment.yaml
conda activate chart2code

Set up the API keys and base URLs in .env for the different LMMs. Claude, Gemini, and GPT are accessed through API proxy providers, while Seed is accessed through the ARK API.

OPENAI_API_KEY=${your_api_proxy_provider_key}
ARK_API_KEY=${your_ark_api_key}
OPENAI_API_URL=${your_api_proxy_provider_url}
ARK_BASE_URL=${your_ark_api_base_url}
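As a quick sanity check, you can confirm the keys are visible from Python. This is only a minimal sketch, assuming python-dotenv is available in the chart2code environment (not verified); the variable names follow the .env above.

# Verify that the variables in .env are readable (assumes python-dotenv).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
for name in ["OPENAI_API_KEY", "ARK_API_KEY", "OPENAI_API_URL", "ARK_BASE_URL"]:
    print(f"{name}: {'set' if os.getenv(name) else 'MISSING'}")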

Download Data

Download the Chart2Code data from Hugging Face and unzip it under the repository root directory.

wget https://huggingface.co/datasets/CSU-JPG/Chart2Code/resolve/main/data.zip
unzip data.zip

The resulting file structure should look like this:

├── data
│   ├── level1_direct
│   │   ├── 3d_1.png
│   │   ├── 3d_1.py
│   │   └── ...
│   ├── level1_figure
│   │   ├── fig1_density_2
│   │   └── ...
│   ├── level1_customize
│   │   ├── table_1_instruction_2.png
│   │   ├── table_1_instruction_2.py
│   │   ├── table_1_instruction_2_request.txt
│   │   ├── table_1_instruction_2_data.txt
│   │   └── ...
│   ├── level2
│   │   ├── bar_1_v1.png
│   │   ├── bar_1_v1.py
│   │   ├── bar_1_v1_data.txt
│   │   └── ...
│   ├── level3
│   │   ├── table_1.xlsx
│   │   ├── table1_1.png
│   │   ├── table1_1_generate.py
│   │   ├── table1_1.txt
│   │   ├── table1_1_generate.png
│   │   └── ...
│   ├── level1_direct.json
│   ├── level1_figure.json
│   ├── level1_customize.json
│   ├── level2.json
│   └── level3.json
├── Evaluation
└── ...
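The level*.json files appear to index the samples of each level. Below is a minimal sketch for peeking at one of them without assuming any particular schema; adjust once you have inspected the real contents.

# Peek at the top-level structure of a level annotation file.
# No particular schema is assumed.
import json

with open("data/level1_direct.json", "r", encoding="utf-8") as f:
    annotations = json.load(f)

print(type(annotations).__name__, "with", len(annotations), "entries")
if isinstance(annotations, dict):
    first_key = next(iter(annotations))
    print("first entry:", first_key, "->", annotations[first_key])
elif isinstance(annotations, list):
    print("first entry:", annotations[0])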

Inference Setup

Inference for each benchmark level is handled by a dedicated shell script in the scripts/inference/ directory. You must specify a model for each run; you can do this in two ways:

  • Pass it as an argument (Recommended): Provide the MODEL_IDENTIFIER directly when executing the script.
  • Edit the script: Set the MODEL_IDENTIFIER variable inside the corresponding .sh file.

You can modify the LOAD_SOURCE parameter in the shell script to select how the model is loaded:

  • local: By default, the model will be loaded from the Inference/models directory.
  • hub: The model weights will be loaded directly from the Hugging Face Hub online.

You can also adjust other parameters like GPU_VISIBLE_DEVICES in the script to fit your hardware setup.
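As a rough illustration of what the LOAD_SOURCE switch controls (this is not the repository's inference code, and the model name below is a placeholder):

# Rough illustration of the LOAD_SOURCE switch (not the repo's actual code).
import os

LOAD_SOURCE = "local"          # "local" or "hub"
MODEL_NAME = "Qwen2.5-VL-7B"   # placeholder name

if LOAD_SOURCE == "local":
    # weights are expected under Inference/models/<model>
    model_path = os.path.join("Inference", "models", MODEL_NAME)
else:
    # weights are resolved online from the Hugging Face Hub
    model_path = MODEL_NAME

print("loading weights from:", model_path)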

cd scripts/inference
# For level1_customize
bash inference_customize.sh qwen3_customize_30B
# For level1_direct
bash inference_direct.sh qwen2.5_direct_72B
# For level1_figure
bash inference_figure.sh InternVL_3.5_figure_38B
# For level2
bash inference_level2.sh deepseek_level2
# For level3
bash inference_level3.sh gpt_5_level3
Available Models

We currently support the following models; each column gives the MODEL_IDENTIFIER to use for the corresponding benchmark level:
| Model Name | level1_customize | level1_direct | level1_figure | level2 | level3 |
| --- | --- | --- | --- | --- | --- |
| InternVL-3.5-38B | InternVL_3.5_customize_38B | InternVL_3.5_direct_38B | InternVL_3.5_figure_38B | InternVL_3.5_level2_38B | InternVL_3.5_level3_38B |
| InternVL-3.5-8B | InternVL_3.5_customize_8B | InternVL_3.5_direct_8B | InternVL_3.5_figure_8B | InternVL_3.5_level2_8B | InternVL_3.5_level3_8B |
| InternVL-3-38B | InternVL_3_customize_38B | InternVL_3_direct_38B | InternVL_3_figure_38B | InternVL_3_level2_38B | InternVL_3_level3_38B |
| InternVL-3-8B | InternVL_3_customize_8B | InternVL_3_direct_8B | InternVL_3_figure_8B | InternVL_3_level2_8B | InternVL_3_level3_8B |
| InternVL-2.5-38B | InternVL_2.5_customize_38B | InternVL_2.5_direct_38B | InternVL_2.5_figure_38B | InternVL_2.5_level2_38B | InternVL_2.5_level3_38B |
| InternVL-2.5-8B | InternVL_2.5_customize_8B | InternVL_2.5_direct_8B | InternVL_2.5_figure_8B | InternVL_2.5_level2_8B | InternVL_2.5_level3_8B |
| Qwen3-VL-30B | qwen3_customize_30B | qwen3_direct_30B | qwen3_figure_30B | qwen3_level2_30B | qwen3_level3_30B |
| Qwen3-VL-30B-think | qwen3_customize_30B_think | qwen3_direct_30B_think | qwen3_figure_30B_think | qwen3_level2_30B_think | qwen3_level3_30B_think |
| Qwen2.5-VL-72B | qwen2.5_customize_72B | qwen2.5_direct_72B | qwen2.5_figure_72B | qwen2.5_level2_72B | qwen2.5_level3_72B |
| Qwen2.5-VL-7B | qwen2.5_customize_7B | qwen2.5_direct_7B | qwen2.5_figure_7B | qwen2.5_level2_7B | qwen2.5_level3_7B |
| Qwen2-VL-72B | qwen2_customize_72B | qwen2_direct_72B | qwen2_figure_72B | qwen2_level2_72B | qwen2_level3_72B |
| Qwen2-VL-7B | qwen2_customize_7B | qwen2_direct_7B | qwen2_figure_7B | qwen2_level2_7B | qwen2_level3_7B |
| MOLMO-7B-D | molmo_customize_7BD | molmo_direct_7BD | molmo_figure_7BD | molmo_level2_7BD | molmo_level3_7BD |
| MIMO-VL-7B-RL-think | mimo_RL_customize_think | mimo_RL_direct_think | mimo_RL_figure_think | mimo_RL_level2_think | mimo_RL_level3_think |
| MIMO-VL-7B-RL-nothink | mimo_RL_customize_nothink | mimo_RL_direct_nothink | mimo_RL_figure_nothink | mimo_RL_level2_nothink | mimo_RL_level3_nothink |
| MIMO-VL-7B-SFT-nothink | mimo_SFT_customize_nothink | mimo_SFT_direct_nothink | mimo_SFT_figure_nothink | mimo_SFT_level2_nothink | mimo_SFT_level3_nothink |
| MIMO-VL-7B-SFT-think | mimo_SFT_customize_think | mimo_SFT_direct_think | mimo_SFT_figure_think | mimo_SFT_level2_think | mimo_SFT_level3_think |
| LLaVA-OV-Qwen2-7B-OV | llava_ov_customize | llava_ov_direct | llava_ov_figure | liava_ov_level2 | llava_ov_level3 |
| LLaVA-OV-Qwen2-7B-SI | llava_si_customize | llava_si_direct | llava_si_figure | llava_si_level2 | llava_si_level3 |
| SEED-1.6-VL | seed_1.6_customize | seed_1.6_direct | seed_1.6_figure | seed_1.6_level2 | seed_1.6_level3 |
| SEED-1.5-VL | seed_1.5_customize | seed_1.5_direct | seed_1.5_figure | seed_1.5_level2 | seed_1.5_level3 |
| Claude-Sonnet-4 | claude_customize | claude_direct | claude_figure | claude_level2 | claude_level3 |
| DeepSeek-VL-7B | deepseek_customize | deepseek_direct | deepseek_figure | deepseek_level2 | deepseek_level3 |
| Gemini-2.5-Pro | gemini_2.5_customize | gemini_2.5_direct | gemini_2.5_figure | gemini_2.5_level2 | gemini_2.5_level3 |
| GLM-4V-9B | glm_customize | glm_direct | glm_figure | glm_level2 | glm_level3 |
| GPT-5 | gpt_5_customize | gpt_5_direct | gpt_5_figure | gpt_5_level2 | gpt_5_level3 |
| Kimi-VL-A3B | kimi_customize | kimi_direct | kimi_figure | kimi_level2 | kimi_level3 |

Evaluation Setup

For the results obtained from inference, the first step is to check the execution rate. Code that runs successfully, together with the images it generates, then undergoes the following evaluations: base_evaluation, LLM_evaluation, and LMM_evaluation.

cd scripts/evaluate
# step1: check execution rate
bash execute_evaluate.sh
# step2: run base evaluation
bash base_evaluator.sh
# step3: run LLM evaluation to evaluate the code
bash LLM_evaluator.sh
# step4: run LMM evaluation to evaluate the image
bash LMM_evaluator.sh
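For intuition, the execution-rate check in step 1 amounts to running each generated script and counting how many finish without errors. The sketch below is conceptual only (see scripts/evaluate/execute_evaluate.sh for the actual implementation); the results directory is a placeholder.

# Conceptual sketch of an execution-rate check (not the repo's implementation).
import pathlib
import subprocess

RESULTS_DIR = pathlib.Path("results/level1_direct")  # placeholder path
scripts = sorted(RESULTS_DIR.glob("*.py"))

ok = 0
for script in scripts:
    try:
        proc = subprocess.run(["python", str(script)],
                              capture_output=True, timeout=120)
        ok += proc.returncode == 0
    except subprocess.TimeoutExpired:
        pass  # a hanging script counts as a failure

if scripts:
    print(f"execution rate: {ok}/{len(scripts)} = {ok / len(scripts):.1%}")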

📢 Update

  • [2025.10.22] We released our paper on arXiv.

❤ Acknowledgement

  • Special thanks to Henry Hengyuan Zhao for serving as the Project Leader of this paper.

  • We are grateful to Lijian Wu and Ziyuan Zhen for their hard work in data annotation and baseline testing.

  • We also extend our appreciation to Mao Dongxing, Yifei Tao, Lijian Wu, and Wan Yang for their contributions to this work.

🎓 BibTeX

If you find Chart2Code useful, please cite it using this BibTeX:

@misc{tang2025chartscodehierarchicalbenchmark,
      title={From Charts to Code: A Hierarchical Benchmark for Multimodal Models}, 
      author={Jiahao Tang and Henry Hengyuan Zhao and Lijian Wu and Yifei Tao and Dongxing Mao and Yang Wan and Jingru Tan and Min Zeng and Min Li and Alex Jinpeng Wang},
      year={2025},
      eprint={2510.17932},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2510.17932}, 
}
