
[CVPR 2025] The official implementation of the paper "Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs"


OoDBag/VideoMindPalace


VideoMindPalace

[CVPR 2025] The official implementation of the paper "Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs"


⚙️ Environment Setup

# Create and activate the environment
conda create -n mindpalace python=3.9
conda activate mindpalace

# Install dependencies
pip install openai
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install pandas
pip install transformers==4.28.1
pip install accelerate
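
After installation, a quick check can confirm that the packages listed above are present and that `transformers` matches the pinned version. The following is a minimal sketch, not part of the repository: the helper names `installed_versions` and `report` are hypothetical, and the package list is simply copied from the pip commands above.

```python
from importlib import metadata

# Distribution names copied from the pip commands above.
REQUIRED = ["openai", "torch", "torchvision", "torchaudio",
            "pandas", "transformers", "accelerate"]
# Only transformers is pinned in the setup instructions.
PINNED = {"transformers": "4.28.1"}


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report(versions, pinned):
    """Return a list of human-readable problems (empty if nothing is wrong)."""
    problems = []
    for name, version in versions.items():
        if version is None:
            problems.append(f"{name}: not installed")
        elif name in pinned and version != pinned[name]:
            problems.append(f"{name}: found {version}, expected {pinned[name]}")
    return problems


if __name__ == "__main__":
    issues = report(installed_versions(REQUIRED), PINNED)
    for line in issues or ["all dependencies look fine"]:
        print(line)
```

Run it inside the activated `mindpalace` environment; any missing or mismatched package is printed on its own line.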

🧭 Full Pipeline Overview

📦 1. Preprocessing and Tracking Extraction (EgoSchema)

We use AMEGO's tracking pipeline to extract per-frame object trajectories from EgoSchema videos.

# Follow AMEGO's official instructions to obtain tracking outputs

🧱 2. Tracked Object Classification and Clustering

python cluster_class.py
python cluster.py

πŸ“ 3. Caption Generation

python caption.py

πŸ•ΈοΈ 4. Graph Construction

python build_graph.py

❓ 5. Graph-based Question Answering

sh egoschema_qa.sh
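
Steps 2 to 5 above can be chained in a small driver that stops at the first failing step. This is a sketch under stated assumptions rather than a script shipped with the repository: it assumes the scripts are run from the repository root with no arguments (as shown above), that the AMEGO tracking outputs from step 1 already exist, and the names `PIPELINE` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Steps 2-5 in the order listed above. Step 1 (AMEGO tracking) is run
# separately by following AMEGO's own instructions.
# sys.executable is used in place of "python" so the active environment's
# interpreter is the one that runs each script.
PIPELINE = [
    ("object classification", [sys.executable, "cluster_class.py"]),
    ("clustering", [sys.executable, "cluster.py"]),
    ("caption generation", [sys.executable, "caption.py"]),
    ("graph construction", [sys.executable, "build_graph.py"]),
    ("question answering", ["sh", "egoschema_qa.sh"]),
]


def run_pipeline(steps, dry_run=True):
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed (or, in a dry run,
    the names of the steps that would be executed).
    """
    completed = []
    for name, command in steps:
        print(f"[{name}] {' '.join(command)}")
        if not dry_run:
            result = subprocess.run(command)
            if result.returncode != 0:
                print(f"step '{name}' failed with exit code {result.returncode}")
                break
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Dry run by default: only prints the planned commands.
    # Call run_pipeline(PIPELINE, dry_run=False) to actually execute them.
    run_pipeline(PIPELINE)
```

Stopping on the first non-zero exit code keeps later steps from consuming incomplete outputs of an earlier one.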

📜 Citation

If you find this work useful, please consider citing:

@article{huang2025building,
  title={Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs},
  author={Huang, Zeyi and Ji, Yuyang and Wang, Xiaofang and Mehta, Nikhil and Xiao, Tong and Lee, Donghyun and Vanvalkenburgh, Sigmund and Zha, Shengxin and Lai, Bolin and Yu, Licheng and others},
  journal={arXiv preprint arXiv:2501.04336},
  year={2025}
}
