Official implementation and website of Fun3DU, a novel method for functionality understanding and segmentation in 3D scenes. The technical report is available on arXiv.
- 06/04/25: Fun3DU will be presented as a highlight paper.
- 02/04/25: Code for Fun3DU has been released.
- 27/02/25: Fun3DU has been accepted at CVPR25!
NB: the split0 and split1 splits mentioned in the paper correspond to the validation and training splits of SceneFun3D, respectively.
- Create the dataset root folder `$ROOT` (the dataset scripts assume it to be `data/scenefun3d/`).
- Download the original dataset folder and put it in `$ROOT`.
- Create the two splits (SceneFun3D provides them as a single file) by running the following scripts:
python scripts/make_video_list.py train
python scripts/make_video_list.py val
- Download the splits with the following scripts:
python scripts/sun3d/data_asset_download.py --split custom --video_id_csv $ROOT/benchmark_file_lists/val_set.csv --download_dir $ROOT/val --dataset_asset laser_scan_5mm crop_mask annotations descriptions hires_wide hires_wide_intrinsics hires_depth hires_poses
python scripts/sun3d/data_asset_download.py --split custom --video_id_csv $ROOT/benchmark_file_lists/train_set.csv --download_dir $ROOT/train --dataset_asset laser_scan_5mm crop_mask annotations descriptions hires_wide hires_wide_intrinsics hires_depth hires_poses
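After these steps, `$ROOT` should look roughly as follows (a sketch based on the commands above; the per-split folders contain the downloaded assets):

```
$ROOT/
├── benchmark_file_lists/   # train_set.csv and val_set.csv used by the download script
├── train/                  # assets for the training split (split1)
└── val/                    # assets for the validation split (split0)
```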
TODO.
The file `requirements.txt` shows a partial list of the required packages.
Fun3DU is divided into 4 scripts, each executing one of the 4 steps described in the paper (LLM preprocessing, context object segmentation, functional object segmentation, and multi-view agreement). The intermediate results of the LLM preprocessing and context object segmentation steps are saved in the dataset folder, while those of the remaining two steps are saved in a dedicated experiment folder. This allows trying the method with different configurations without rerunning all steps.
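As a quick reference, the full pipeline on the validation split can be chained as sketched below (each command is detailed in the following sections; it assumes `$ROOT` and `$EXP` are set and the ollama backend is running):

```bash
# Sketch: run all 4 steps plus evaluation on the validation split.
python run_llm.py dataset.root=$ROOT dataset.split=val llm_type=llama
python run_detection.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard
python run_molmo.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard exp_name=$EXP
python run_lifting.py dataset.root=$ROOT dataset.split=val mask_type=standard exp_name=$EXP
python evaluate.py dataset.root=$ROOT dataset.split=val exp_name=$EXP threshold=0.7
```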
This will generate a .json file for each visit, containing the results of the LLM reasoning on each description.
- Download ollama from https://ollama.com
- Install the python library: `pip install ollama`
- Run `ollama serve` to start the ollama backend (it should start the backend on `localhost:11434`)
- In another window, run `ollama pull llama3.1` to download Llama
- If the port is not 11434, update `OLLAMA_PORT` to the correct port in `run_llm.py` (line 8)
- Run:
python run_llm.py dataset.root=$ROOT dataset.split=val llm_type=llama
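If `run_llm.py` cannot reach the backend, a quick sanity check (assuming the default port) is to query the ollama API and verify that `llama3.1` is listed among the pulled models:

```bash
# The backend should respond with the list of locally available models.
curl http://localhost:11434/api/tags
ollama list
```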
Run the following script to save the predicted masks for all contextual objects in a scene:
python run_detection.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard
Choose an `exp_root` in which to save the experiments' intermediate results (`exps` by default).
Run the following script to save the predicted masks for functional objects in a scene:
python run_molmo.py dataset.root=$ROOT dataset.split=val llm_type=llama mask_type=standard exp_name=$EXP
By default, intermediate results will be saved in `exps/$EXP/frames`. This can be changed in the config.
Run the following script to lift the predicted masks from the previous step and produce a point cloud:
python run_lifting.py dataset.root=$ROOT dataset.split=val mask_type=standard exp_name=$EXP
By default, point cloud data will be saved in `exps/$EXP/pcds`. This can be changed in the config.
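For a quick look at the lifted results, the output folder can be inspected as sketched below (file names and formats depend on the config; the `.ply` extension and the Open3D viewer are assumptions, not part of the repository):

```bash
# List the point clouds produced for this experiment.
ls exps/$EXP/pcds | head
# If the files are standard .ply point clouds (an assumption), they can be viewed with Open3D:
# python -c "import open3d as o3d; o3d.visualization.draw_geometries([o3d.io.read_point_cloud('exps/$EXP/pcds/FILE.ply')])"
```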
Run the following script to evaluate an experiment with a fixed threshold:
python evaluate.py dataset.root=$ROOT dataset.split=val exp_name=$EXP threshold=0.7
By default, this will evaluate using the point cloud data in `exps/$EXP/pcds`. This can be changed in the config.
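Since the threshold is passed as a command-line override, a simple sweep over several values can be run as sketched below (the chosen values are just an example):

```bash
# Evaluate the same experiment at several thresholds.
for t in 0.5 0.6 0.7 0.8 0.9; do
  python evaluate.py dataset.root=$ROOT dataset.split=val exp_name=$EXP threshold=$t
done
```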
If you find Fun3DU useful for your work, consider citing it:
@inproceedings{corsetti2025fun3du,
title={Functionality understanding and segmentation in 3D scenes},
author={Corsetti, Jaime and Giuliari, Francesco and Fasoli, Alice and Boscaini, Davide and Poiesi, Fabio},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}
We thank the authors of SceneFun3D for the dataset toolkit, on which our implementation is based. This work was supported by the European Union’s Horizon Europe research and innovation programme under grant agreement No 101058589 (AI-PRISM). We also acknowledge ISCRA for awarding this project access to the LEONARDO supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CINECA (Italy).
The website template is from Nerfies.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.