[CVPR 2025] The official implementation of the paper "Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs"
```shell
# Create and activate the environment
conda create -n mindpalace python=3.9
conda activate mindpalace

# Install dependencies
pip install openai
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install pandas
pip install transformers==4.28.1
pip install accelerate
```

We use AMEGO's tracking pipeline to extract per-frame object trajectories from EgoSchema videos.
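To illustrate the idea of organizing tracked objects into a semantic graph, here is a toy sketch in Python. The schema is hypothetical (it is not the format produced by `build_graph.py`): nodes are tracked objects, and edges link objects whose trajectories co-occur in the same video segment.

```python
# Hypothetical sketch, not the repo's actual graph schema: link two tracked
# objects whenever their trajectories overlap in at least one video segment.
from collections import defaultdict
from itertools import combinations

def build_semantic_graph(tracks):
    """tracks: {object_id: set of segment indices where the object appears}."""
    graph = defaultdict(set)
    for a, b in combinations(tracks, 2):
        # Co-occurrence in any segment creates an undirected edge.
        if tracks[a] & tracks[b]:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)

# Example: three tracked objects across five video segments.
tracks = {
    "cup":   {0, 1, 2},
    "sink":  {2, 3},
    "towel": {4},
}
graph = build_semantic_graph(tracks)
print(graph)  # "cup" and "sink" share segment 2; "towel" stays isolated
```

The actual pipeline below additionally clusters and captions trajectories before graph construction; this sketch only shows the co-occurrence step.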
Follow AMEGO's official instructions to obtain the tracking outputs, then run the pipeline:

```shell
python cluster_class.py
python cluster.py
python caption.py
python build_graph.py
sh egoschema_qa.sh
```

If you find this work useful, please consider citing:
```bibtex
@article{huang2025building,
  title={Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs},
  author={Huang, Zeyi and Ji, Yuyang and Wang, Xiaofang and Mehta, Nikhil and Xiao, Tong and Lee, Donghyun and Vanvalkenburgh, Sigmund and Zha, Shengxin and Lai, Bolin and Yu, Licheng and others},
  journal={arXiv preprint arXiv:2501.04336},
  year={2025}
}
```