A framework that leverages scene graphs and graph attention networks (GATs) to classify indoor scenes. Official implementation of "Attention over Scene Graphs: Indoor Scene Representations Toward CSAI Classification", accepted at the 1st Workshop on From Scene Understanding to Human Modeling at BMVC 2025.
First, you must clone the repository:
git clone [email protected]:tutuzeraa/ASGRA.git ASGRA
cd ASGRA
git submodule update --init # to use the modified Pix2Grp
To install the framework, run the following commands:
conda create -n ASGRA python=3.11 pytorch torchvision torchaudio -c pytorch -c nvidia
conda activate ASGRA
pip install -r requirements.txt
python3 setup.py install
We evaluate our approach on two datasets:
- Places8
- RCPD
See datasets.md for more information on how to set up the datasets.
To generate the scene graphs, we use Pix2Grp, which we adapted to output the graphs in a format our pipeline can process. To generate the graphs as we did, follow the instructions here.
You can download the pretrained weights for the Places8 dataset here.
To train and evaluate the model, you can run the following commands:
CUDA_VISIBLE_DEVICES=0 python3 asgra/main.py -m train -c configs/asgra_best.json -w 8 -o results/run1
CUDA_VISIBLE_DEVICES=0 python3 asgra/main.py -m eval -c configs/asgra_best.json -w 8 -o results/eval-run1 --weights path-to-trained-weights
This repository is built on Pix2Grp, which is in turn built on LAVIS and SGTR. We would like to thank their authors for their great open-source code and models.
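If you run training and evaluation repeatedly, a small helper can assemble the two command lines shown earlier so that the flags stay consistent between runs. The following is only a sketch: the function name `asgra_cmd` and the `GPU` variable are hypothetical and not part of this repository, and only the flags (`-m`, `-c`, `-w`, `-o`, `--weights`) are taken from the commands in this README. The helper prints the command instead of running it, so you can inspect it first.

```shell
# Hypothetical helper (not part of ASGRA): prints a train or eval command line.
# Usage: asgra_cmd MODE OUTPUT_DIR [WEIGHTS_PATH]
# The GPU index defaults to 0 and can be overridden with the GPU variable.
asgra_cmd() {
    mode="$1"
    out="$2"
    weights="${3:-}"
    cmd="CUDA_VISIBLE_DEVICES=${GPU:-0} python3 asgra/main.py -m ${mode} -c configs/asgra_best.json -w 8 -o ${out}"
    # Append --weights only when a weights path is given (needed for eval).
    if [ -n "$weights" ]; then
        cmd="${cmd} --weights ${weights}"
    fi
    printf '%s\n' "$cmd"
}
```

For example, `asgra_cmd train results/run1` prints the training command, and `asgra_cmd eval results/eval-run1 path-to-trained-weights` prints the evaluation command. To execute a printed command, pass it to `eval`, for instance `eval "$(asgra_cmd train results/run1)"`.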