Dvir Samuel, Rami Ben-Ari, Matan Levy, Nir Darshan, Gal Chechik
Bar Ilan University, The Hebrew University of Jerusalem, NVIDIA Research
Personalized retrieval and segmentation aim to locate specific instances within a dataset based on an input image and a short description of the reference instance. While supervised methods are effective, they require extensive labeled data for training. Recently, self-supervised foundation models have been applied to these tasks, achieving results comparable to supervised methods. However, these models exhibit a significant flaw: they struggle to locate the desired instance when other instances of the same class are present. In this paper, we explore text-to-image diffusion models for these tasks. Specifically, we propose PDM (Personalized Diffusion Features Matching), a novel approach that leverages intermediate features of pre-trained text-to-image models for personalization tasks without any additional training. PDM demonstrates superior performance on popular retrieval and segmentation benchmarks, outperforming even supervised methods. We also highlight notable shortcomings in current instance retrieval and segmentation datasets and propose new benchmarks for these tasks.
The personalized segmentation task involves segmenting a specific reference object in a new scene. Our method accurately identifies the specific reference instance in the target image, even when other objects of the same class are present. While other methods capture visually or semantically similar objects, ours successfully extracts the identical instance by using a new personalized feature map that fuses semantic and appearance cues. Red and green indicate incorrect and correct segmentations, respectively.
Quick installation using pip:
torch==2.0.1
torchvision==0.15.2
diffusers==0.18.2
transformers==4.32.0.dev0
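The pins above can be installed in one step. Note that `transformers==4.32.0.dev0` is a development build that is not published on PyPI; the closest stable release is substituted below as an assumption (install from source if the exact dev build is required):

pip install torch==2.0.1 torchvision==0.15.2 diffusers==0.18.2 transformers==4.32.0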
To visualize PDM matching between two images, run the following:
python pdm_matching.py
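For context, here is a minimal, hypothetical sketch of the underlying idea: extract intermediate UNet features from a pre-trained Stable Diffusion model for each image (DIFT-style; PDM additionally fuses appearance and semantic cues), then match locations by cosine similarity. The model name, timestep, up-block index, and file names below are illustrative assumptions, not the repository's exact configuration.

```python
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms
from diffusers import StableDiffusionPipeline

device = "cuda"
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to(device)

@torch.no_grad()
def unet_features(image_path, t=261, prompt=""):
    # Encode the image into the VAE latent space.
    img = Image.open(image_path).convert("RGB").resize((512, 512))
    x = transforms.ToTensor()(img).unsqueeze(0).to(device, torch.float16) * 2 - 1
    latents = pipe.vae.encode(x).latent_dist.mean * pipe.vae.config.scaling_factor

    # Add noise at a single intermediate timestep t.
    t_tensor = torch.tensor([t], device=device)
    noisy = pipe.scheduler.add_noise(latents, torch.randn_like(latents), t_tensor)

    # Capture an intermediate decoder feature map with a forward hook.
    feats = {}
    hook = pipe.unet.up_blocks[1].register_forward_hook(
        lambda module, inp, out: feats.update(f=out)
    )

    # Empty-prompt text embedding for an unconditional forward pass.
    ids = pipe.tokenizer(
        prompt, padding="max_length",
        max_length=pipe.tokenizer.model_max_length, return_tensors="pt",
    ).input_ids.to(device)
    emb = pipe.text_encoder(ids)[0]

    pipe.unet(noisy, t_tensor, encoder_hidden_states=emb)
    hook.remove()
    return feats["f"].squeeze(0).float()  # (C, H, W)

# Dense cosine-similarity matching between the two feature maps.
f1, f2 = unet_features("image1.png"), unet_features("image2.png")
c, h, w = f1.shape
s1 = F.normalize(f1.reshape(c, -1), dim=0)  # (C, H*W)
s2 = F.normalize(f2.reshape(c, -1), dim=0)
sim = s1.T @ s2                             # (H*W, H*W) similarity matrix
match = sim.argmax(dim=1)                   # best image-2 location per image-1 location
```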
The PerMIR and PerMIS datasets were sourced from the BURST repository.
- Download the datasets from the BURST repository. Place the train, val, and test sets in the same directory.
- Run the script `PerMIRS/permirs_gen_dataset.py` to prepare the personalization datasets. Ensure `--images_base_dir` contains the downloaded BURST splits, and set `--annotations_file` to `all_classes.json` (see the example invocations after this list).
- Run `PerMIRS/extract_diff_features.py` to extract PDM and DIFT features from each image in the dataset.
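For reference, the two scripts above can be invoked as follows (the path is a placeholder, and the scripts may accept additional flags):

python PerMIRS/permirs_gen_dataset.py --images_base_dir /path/to/BURST --annotations_file all_classes.json
python PerMIRS/extract_diff_features.py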
For PDM evaluation on the PerMIR dataset (personalized retrieval), run:
python pdm_permir.py
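As a rough illustration of how personalized retrieval can be scored with such features, here is a sketch of one plausible scheme, not necessarily the script's exact protocol: pool the reference instance's features under its mask into a single descriptor, then rank gallery images by their best-matching spatial location.

```python
import torch
import torch.nn.functional as F

def retrieval_scores(ref_feat, ref_mask, gallery_feats):
    """Rank gallery images by how well any location matches the reference instance.
    ref_feat: (C, H, W) diffusion features of the reference image.
    ref_mask: (H, W) binary instance mask, downsampled to the feature resolution.
    gallery_feats: iterable of (C, H, W) gallery feature maps."""
    c = ref_feat.shape[0]
    # Masked average pooling -> a single unit-norm descriptor for the instance.
    m = ref_mask.reshape(-1).bool()
    q = F.normalize(ref_feat.reshape(c, -1)[:, m].mean(dim=1), dim=0)  # (C,)
    # Score each gallery image by its best-matching spatial location.
    return [
        (F.normalize(g.reshape(c, -1), dim=0).T @ q).max().item()
        for g in gallery_feats
    ]
```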
For PDM evaluation on the PerMIS dataset (personalized segmentation), run:
python pdm_permis.py
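Similarly, a personalized segmentation mask can be sketched by thresholding the upsampled similarity map between the reference descriptor and the target image's features; again a hypothetical illustration, not the repository's exact procedure.

```python
import torch
import torch.nn.functional as F

def similarity_mask(q, target_feat, out_size=(512, 512), thresh=0.5):
    """q: (C,) unit-norm reference descriptor (e.g., from the retrieval sketch above).
    target_feat: (C, H, W) diffusion features of the target image.
    Returns a boolean mask at out_size resolution."""
    c, h, w = target_feat.shape
    # Cosine similarity of every target location to the reference descriptor.
    sims = (F.normalize(target_feat.reshape(c, -1), dim=0).T @ q).reshape(1, 1, h, w)
    # Upsample to image resolution and threshold.
    sims = F.interpolate(sims, size=out_size, mode="bilinear", align_corners=False)
    return sims.squeeze() > thresh
```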
If you find our paper and repo useful, please cite:
@inproceedings{Samuel2024Waldo,
title={Where's Waldo: Diffusion Features For Personalized Segmentation and Retrieval},
author={Dvir Samuel and Rami Ben-Ari and Matan Levy and Nir Darshan and Gal Chechik},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2024}
}