Official implementation and website of Fun3DU, a novel method for functionality understanding and segmentation in 3D scenes. The technical report is available on arXiv.
- 06/04/25: Fun3DU will be presented as a highlight paper.
- 02/04/25: Code for Fun3DU has been released.
- 27/02/25: Fun3DU has been accepted at CVPR25!
NB: the split0 and split1 splits mentioned in the paper correspond to the validation and training splits of SceneFun3D, respectively.
- Create the dataset root folder `$ROOT` (the dataset scripts assume it to be `data/scenefun3d/`).
- Download the original dataset folder and put it in `$ROOT`.
- Create the two splits (SceneFun3D provides them as a single file) by running the following scripts:
python scripts/make_video_list.py train
python scripts/make_video_list.py val
- Download the splits with the following scripts:
python scripts/sun3d/data_asset_download.py --split custom --video_id_csv $ROOT/benchmark_file_lists/val_set.csv --download_dir $ROOT/val --dataset_asset laser_scan_5mm crop_mask annotations descriptions hires_wide hires_wide_intrinsics hires_depth hires_poses
python scripts/sun3d/data_asset_download.py --split custom --video_id_csv $ROOT/benchmark_file_lists/train_set.csv --download_dir $ROOT/train --dataset_asset laser_scan_5mm crop_mask annotations descriptions hires_wide hires_wide_intrinsics hires_depth hires_poses
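After these steps, `$ROOT` should look roughly as follows (a sketch based on the commands above; the per-split folders contain the downloaded assets):

```
$ROOT/
├── benchmark_file_lists/   # train_set.csv and val_set.csv used by the download script
├── train/                  # assets for the training split (split1)
└── val/                    # assets for the validation split (split0)
```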
TODO.
The file `requirements.txt` shows a partial list of the required packages.
Fun3DU is divided into 4 scripts, each executing one of the 4 steps described in the paper (LLM preprocessing, context object segmentation, functional object segmentation, and multi-view agreement). The intermediate results of the LLM preprocessing and context object segmentation steps are saved in the dataset folder, while those of the remaining two steps are saved in a dedicated experiment folder. This allows trying the method with different configurations without rerunning all steps.
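As a quick reference, the full pipeline on the validation split can be chained as sketched below (each command is detailed in the following sections; it assumes `$ROOT` and `$EXP` are set and the ollama backend is running):

```bash
# Sketch: run all 4 steps plus evaluation on the validation split.
python run_llm.py dataset.root=$ROOT dataset.split=val llm_type=llama
python run_detection.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard
python run_molmo.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard exp_name=$EXP
python run_lifting.py dataset.root=$ROOT dataset.split=val mask_type=standard exp_name=$EXP
python evaluate.py dataset.root=$ROOT dataset.split=val exp_name=$EXP threshold=0.7
```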
This will generate a .json file for each visit, containing the results of the LLM reasoning on each description.
- Download ollama from https://ollama.com
- Install the python library: `pip install ollama`
- Run `ollama serve` to start the ollama backend (it should start the backend on `localhost:11434`)
- In another window, run `ollama pull llama3.1` to download Llama
- If the port is not 11434, update `OLLAMA_PORT` to the correct port in `run_llm.py` (line 8)
- Run:
python run_llm.py dataset.root=$ROOT dataset.split=val llm_type=llama
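If `run_llm.py` cannot reach the backend, a quick sanity check (assuming the default port) is to query the ollama API and verify that `llama3.1` is listed among the pulled models:

```bash
# The backend should respond with the list of locally available models.
curl http://localhost:11434/api/tags
ollama list
```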
Run the following script to save the predicted masks for all contextual objects in a scene:
python run_detection.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard
Choose an `exp_root` in which to save the experiments' intermediate results (`exps` by default).
Run the following script to save the predicted masks for functional objects in a scene:
python run_molmo.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard exp_name=$EXP
By default, intermediate results will be saved in `exps/$EXP/frames`. This can be changed in the config.
Run the following script to lift the predicted masks from the previous step and produce a point cloud:
python run_lifting.py dataset.root=$ROOT dataset.split=val mask_type=standard exp_name=$EXP
By default, point cloud data will be saved in `exps/$EXP/pcds`. This can be changed in the config.
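For a quick look at the lifted results, the output folder can be inspected as sketched below (file names and formats depend on the config; the `.ply` extension and the Open3D viewer are assumptions, not part of the repository):

```bash
# List the point clouds produced for this experiment.
ls exps/$EXP/pcds | head
# If the files are standard .ply point clouds (an assumption), they can be viewed with Open3D:
# python -c "import open3d as o3d; o3d.visualization.draw_geometries([o3d.io.read_point_cloud('exps/$EXP/pcds/FILE.ply')])"
```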
Run the following script to evaluate an experiment with a fixed threshold:
python evaluate.py dataset.root=$ROOT dataset.split=val exp_name=$EXP threshold=0.7
By default, this will evaluate using the point cloud data in `exps/$EXP/pcds`. This can be changed in the config.
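Since the threshold is passed as a command-line override, a simple sweep over several values can be run as sketched below (the chosen values are just an example):

```bash
# Evaluate the same experiment at several thresholds.
for t in 0.5 0.6 0.7 0.8 0.9; do
  python evaluate.py dataset.root=$ROOT dataset.split=val exp_name=$EXP threshold=$t
done
```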
If you find Fun3DU useful for your work, consider citing it:
@inproceedings{corsetti2025fun3du,
title={Functionality understanding and segmentation in 3D scenes},
author={Corsetti, Jaime and Giuliari, Francesco and Fasoli, Alice and Boscaini, Davide and Poiesi, Fabio},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}
We thank the authors of SceneFun3D for the dataset toolkit, on which our implementation is based. This work was supported by the European Union’s Horizon Europe research and innovation programme under grant agreement No 101058589 (AI-PRISM). We also acknowledge ISCRA for awarding this project access to the LEONARDO supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CINECA (Italy).
The website template is from Nerfies.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.