mailcorahul/f4-its

f4-its

F4-ITS: Fine-grained Feature Fusion for Food Image-Text Search is a training-free, vision-language model (VLM)-guided framework that improves retrieval performance through enhanced multi-modal feature representations. Our approach makes two key contributions: (1) a uni-directional (and bi-directional) multi-modal fusion strategy that combines image embeddings with VLM-generated textual descriptions to improve query expressiveness, and (2) a novel feature-based re-ranking mechanism for top-k retrieval that uses predicted food ingredients to refine results and boost precision.

Task 1: Single Image-Text Retrieval - Fusion Architecture

[Figure: F4-ITS Fusion Architecture]
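
The fusion idea described above, combining an image embedding with the embedding of a VLM-generated description to form a richer query, can be illustrated with a minimal sketch. The README does not state the exact fusion operator, so the weighted sum below is an assumption, as are the names `fuse`, `retrieve`, and the parameter `alpha`. Embeddings are plain Python lists here; in practice they would come from an image encoder and a text encoder.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    return [x / norm for x in v]


def fuse(image_emb, text_emb, alpha=0.5):
    """Hypothetical fusion: weighted sum of the two unit vectors, re-normalized.

    alpha is the weight on the image embedding (an assumed parameter,
    not taken from the repository).
    """
    if len(image_emb) != len(text_emb):
        raise ValueError("embeddings must have the same dimension")
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    mixed = [alpha * i + (1.0 - alpha) * t for i, t in zip(img, txt)]
    return l2_normalize(mixed)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def retrieve(query_emb, candidates):
    """Return candidate names sorted by descending cosine similarity.

    candidates maps a name to its embedding.
    """
    return sorted(
        candidates,
        key=lambda name: cosine(query_emb, candidates[name]),
        reverse=True,
    )


# Toy usage: fuse an image embedding with a description embedding,
# then rank three candidate text embeddings against the fused query.
query = fuse([1.0, 0.0], [0.0, 1.0], alpha=0.5)
ranking = retrieve(query, {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]})
```

In this sketch a uni-directional setup would fuse only the query side, as shown; applying `fuse` to the candidate side as well is one possible reading of the bi-directional variant.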

Task 2: Top-k Retrieval - Fusion + Reranking

[Figure: F4-ITS Reranker]
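
The re-ranking step can be sketched in the same spirit: take the top-k candidates from the first-stage retrieval and reorder them using the ingredients predicted for the query image. The README does not specify the scoring function, so the Jaccard overlap and the linear blend with a `weight` parameter below are assumptions, and `rerank` is a hypothetical helper rather than the repository's API.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of ingredient names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rerank(top_k, predicted_ingredients, weight=0.5):
    """Reorder first-stage results using ingredient overlap.

    top_k is a list of (name, similarity_score, ingredients) tuples.
    The final score is an assumed linear blend:
        (1 - weight) * similarity_score + weight * jaccard_overlap
    Returns candidate names sorted by descending blended score.
    """
    def blended(item):
        _, score, ingredients = item
        overlap = jaccard(predicted_ingredients, ingredients)
        return (1.0 - weight) * score + weight * overlap

    return [name for name, _, _ in sorted(top_k, key=blended, reverse=True)]


# Toy usage: the first-stage ranking slightly prefers "beef stew",
# but the predicted ingredients favor "chicken curry".
first_stage = [
    ("beef stew", 0.80, ["beef", "carrot", "potato"]),
    ("chicken curry", 0.78, ["chicken", "onion", "curry"]),
]
reranked = rerank(first_stage, ["chicken", "onion"], weight=0.5)
```

Setting `weight=0.0` leaves the first-stage order unchanged, which makes it easy to compare retrieval with and without the ingredient signal.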

Evaluation Metrics

[Figure: Evaluation Metrics]

F4-ITS Performance under different fusion settings

[Figure: Fusion Metrics]
