[ICCV'23] HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models


thisisWooyeol/HRSBench



Eslam Abdelrahman, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, and Mohamed Elhoseiny

Website · Paper · Video

This is a fork of HRSBench, modified from the Attention-Refocusing codebase. Users can benchmark both text-to-image and box-layout-to-image generation tasks. For box-layout-to-image tasks, the GPT-4-generated layouts provided by the Attention-Refocusing repository are used.



About HRS dataset

The core functionality of HRS-Bench is included in the src directory, largely copied from Attention-Refocusing.

Using the HRS prompts and GPT-4-generated box layouts from Attention-Refocusing, I pre-processed the complete dataset into hrs_dataset in JSONL format.

The dataset includes four main categories: spatial relationship, color, size, and counting. Although the HRS prompts for the counting/spatial/size/color categories number 3,000/1,002/501/501 respectively, some prompts are duplicated; counting only unique prompts gives 2,990/898/424/484.

There are also prompts mysteriously missing from the *.p pickle files (the last line of each dataset), so 2,990/896/423/483 unique prompts can actually be evaluated.
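The total/unique counts above can be reproduced with a short script. A minimal sketch, assuming each line of a split's JSONL file carries the prompt field documented below (the file path is up to the user):

```python
import json

def count_unique_prompts(jsonl_path: str) -> tuple[int, int]:
    """Return (total, unique) prompt counts for one HRS split file."""
    prompts = []
    with open(jsonl_path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines defensively
                prompts.append(json.loads(line)["prompt"])
    return len(prompts), len(set(prompts))
```

Running this over each split in hrs_dataset should yield the pairs listed above.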

How to evaluate

Setup virtual environment first:

uv sync
source .venv/bin/activate

Download the UniDet and MaskDINO weights with the following commands:

gdown 110JSpmfNU__7T3IMSJwv0QSfLLo_AqtZ
wget https://github.com/IDEA-Research/detrex-storage/releases/download/maskdino-v0.1.0/maskdino_swinl_50ep_300q_hid2048_3sd1_instance_maskenhanced_mask52.3ap_box59.0ap.pth

Step-1: Generate Images with HRS prompts (and optionally with box layouts)

The HRS Bench dataset is structured to support benchmarking for both text-to-image (T2I) and layout-to-image (L2I) tasks. The dataset includes prompts and corresponding box layouts for each image. Users can choose to generate images using either the prompts alone or the prompts in conjunction with the box layouts.

All dataset specifications, including prompt formats and layout details, are provided in hrs_dataset/ in JSONL format. The core fields users need to be aware of are:

  • prompt: The text prompt used for image generation.
  • phrases: Short descriptions of the corresponding box layouts.
  • bounding_boxes: The box layout information, given as 4-tuples (x_min, y_min, x_max, y_max) on a 0-1 scale.

ℹ️ Additional information for ISAC

  • about tags

    • expected_obj1 to expected_obj4 (at most four) provide the object tags contained in each prompt. These tags are crucial for tag-based Attention Modulation.
  • about instance count

    • For spatial, size, and color tasks, only one instance is present per object category, so n=1 is the default setting for these tasks.
    • For counting tasks, multiple instances may be present for at most two object categories. Refer to expected_n1 and expected_n2 for the ground-truth instance counts.
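Putting the fields together, here is a hedged sketch of reading one JSONL record, including the optional ISAC tags and counts. Field names follow the description above; the defensive handling of absent keys is an assumption, not a documented guarantee:

```python
import json

def parse_record(line: str):
    """Extract the core fields of one HRS JSONL record."""
    rec = json.loads(line)
    phrases = rec.get("phrases", [])
    # Boxes are (x_min, y_min, x_max, y_max) on a 0-1 scale.
    boxes = rec.get("bounding_boxes", [])
    # Up to four ISAC object tags, up to two ground-truth instance counts.
    tags = [rec[k] for k in ("expected_obj1", "expected_obj2",
                             "expected_obj3", "expected_obj4") if k in rec]
    counts = [rec[k] for k in ("expected_n1", "expected_n2") if k in rec]
    return rec["prompt"], phrases, boxes, tags, counts
```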

Naming rule for HRS dataset

Generated images for each task should be saved in separate folders sharing the same parent directory, and each image should follow the naming convention <prompt_idx>_<level>_<prompt>.[png|jpg]. For example:

/path/to/IMAGE_ROOT/
├── color_seed42/
├── counting_seed42/
├── size_seed42/
└── spatial_seed42/
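To stay consistent with this convention, generation code can build output paths along these lines. This is only a sketch: the filename sanitization rule and the default png extension are assumptions, not part of the spec:

```python
from pathlib import Path

def image_path(image_root: str, task: str, seed: int,
               prompt_idx: int, level: str, prompt: str,
               ext: str = "png") -> Path:
    """Build <IMAGE_ROOT>/<task>_seed<seed>/<prompt_idx>_<level>_<prompt>.<ext>."""
    # Strip path separators that would be unsafe in a filename (assumed rule).
    safe_prompt = prompt.replace("/", "_").strip()
    return Path(image_root) / f"{task}_seed{seed}" / f"{prompt_idx}_{level}_{safe_prompt}.{ext}"
```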

Step-2: Evaluate Generated Images

run_hrs_benchmark.sh provides an easy way to evaluate generated images against the HRS dataset. The script automatically runs the evaluation process and saves the results in the specified output directory.

More details about the intermediate steps can be found in the README.md in the src directory.

Here is the way to run the benchmark script:

bash run_hrs_benchmark.sh <METHOD_NAME> <IMAGE_ROOT> <GENERATION_SEED>

where

  • <METHOD_NAME>: The name of the method being evaluated; it is used to name the output directory (e.g., SD1.5).
  • <IMAGE_ROOT>: The root directory containing the generated images for evaluation.
  • <GENERATION_SEED>: The seed used for image generation, which helps in reproducing results.
