[CVPR2025] 4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Wanhua Li*, Renping Zhou*, Jiawei Zhou, Yingwei Song, Johannes Herter, Minghan Qin, Gao Huang†, Hanspeter Pfister†
(* denotes equal contribution, † denotes co-corresponding authors)
| Project page | Full Paper | Video |
| Datasets Annotations | Google Drive | BaiduWangpan |
| Pretrained Model | Google Drive | BaiduWangpan |
| Pregenerated Point Clouds by COLMAP | Google Drive | BaiduWangpan |

This repository contains the official implementation of the paper "4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models" (CVPR 2025).

😊LangSplat Family

@inproceedings{li20254d,
  title={4d langsplat: 4d language gaussian splatting via multimodal large language models},
  author={Li, Wanhua and Zhou, Renping and Zhou, Jiawei and Song, Yingwei and Herter, Johannes and Qin, Minghan and Huang, Gao and Pfister, Hanspeter},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={22001--22011},
  year={2025}
}

🎉 Our work is based on LangSplat, and we thank the authors for their contributions! LangSplat grounds CLIP features into a set of 3D language Gaussians, attaining precise 3D language fields while being 199× faster than LERF. [CVPR 2024] LangSplat

@inproceedings{qin2024langsplat,
  title={Langsplat: 3d language gaussian splatting},
  author={Qin, Minghan and Li, Wanhua and Zhou, Jiawei and Wang, Haoqian and Pfister, Hanspeter},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={20051--20060},
  year={2024}
}

🎉 We have released LangSplat V2! The new version significantly improves performance, achieving 450+ FPS in rendering. [NeurIPS 2025] LangSplat V2

@article{li2025langsplatv2,
  title={LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS},
  author={Li, Wanhua and Zhao, Yujie and Qin, Minghan and Liu, Yang and Cai, Yuanhao and Gan, Chuang and Pfister, Hanspeter},
  journal={arXiv preprint arXiv:2507.07136},
  year={2025}
}

BibTeX

@inproceedings{li20254dlangsplat4dlanguage,
    title={4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models}, 
    author={Wanhua Li and Renping Zhou and Jiawei Zhou and Yingwei Song and Johannes Herter and Minghan Qin and Gao Huang and Hanspeter Pfister},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2025}
}

Cloning the Repository

The repository contains submodules; please check it out with

git clone git@github.com:zrporz/4DLangSplat.git --recursive
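
If you cloned without --recursive, the submodules can still be fetched afterwards with a standard git command:

git submodule update --init --recursive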

Setup

4D LangSplat uses the following software versions:

  • Python 3.10
  • CUDA 12.4
  • GCC 10.2.0

By default, run the following commands to install the required packages:

conda create -n 4DLangSplat python=3.10
conda activate 4DLangSplat
pip install -r requirements.txt
### submodules for gaussian rasterization ###
pip install -e submodules/simple-knn
pip install -e submodules/4d-langsplat-rasterization
### submodules for generating segmentation maps ###
pip install -e submodules/4d-langsplat-tracking-anything-with-deva
pip install git+https://github.com/facebookresearch/segment-anything.git
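
After installation, a quick sanity check (assuming PyTorch is pulled in by requirements.txt) is to confirm that the CUDA build is visible to PyTorch:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"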

Prepare Datasets

Our models are trained and evaluated on HyperNeRF and Neu3D datasets. Please follow their instructions to prepare your dataset, or run the following commands:

bash scripts/download_hypernerf.sh data/hypernerf
bash scripts/download_neu3d.sh data/neu3d

To evaluate the rendering results, we use Roboflow to annotate the datasets. The annotations can be accessed through this link: Download the Annotations.
Following 4DGaussians, we use COLMAP to generate the point clouds. Please follow their pipeline, or use ours: Download the Point Clouds

Then put them under data/<hypernerf or neu3d>/<dataset name> (a short placement sketch follows the directory tree below). Make sure the data folder is organized as follows:

|——data
|   | hypernerf
|       | americano
|           |——annotations
|               |——train
|               |——README
|               |——video_annotations.json
|           |——camera
|           |——rgb
|               |——1x
|                   |——000001.png
|                   ...
|               |——2x        
|               ...
|           |——dataset.json
|           |——metadata.json
|           |——points.npy
|           |——scene.json
|           |——points3D_downsample2.ply
|       |——chickchicken
|       ...
|   | neu3d
|       | coffee_martini
|           |——annotations
|               |——train
|               |——README
|           |——cam00
|               |——images
|                   |——0000.png
|                   ...
|           |——cam01
|           ...
|           |——cam00.mp4
|           |——cam01.mp4
|           ...
|           |——poses_bounds.npy
|           |——points3D_downsample2.ply
|      |——cur_roasted_beef
|      ...
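
As referenced above, a minimal placement sketch for one HyperNeRF scene; the annotation archive name and its internal layout are assumptions, while points3D_downsample2.ply matches the tree above:

# unpack the downloaded annotations so they match the annotations/ layout shown above (hypothetical archive name)
unzip annotations_americano.zip -d data/hypernerf/americano/annotations
# the COLMAP point cloud goes directly into the scene folder
cp /path/to/downloads/points3D_downsample2.ply data/hypernerf/americano/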

QuickStart

We provide pretrained checkpoints of the Gaussian model and the autoencoder: Download Pretrained Checkpoint.

For the HyperNeRF dataset, take americano as an example. Put the checkpoint folder under output/hypernerf/americano and run the following commands for rendering and evaluation:

bash scripts/render-hypernerf.sh
bash scripts/eval-hypernerf.sh
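
If the downloaded checkpoint arrives as an archive, a minimal sketch of putting it in place before running the two commands above (the archive name is an assumption; the target path follows the instructions above):

mkdir -p output/hypernerf/americano
unzip americano_checkpoint.zip -d output/hypernerf/americano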

For the Neu3D dataset, take coffee_martini as an example. Put the checkpoint folder under output/neu3d/coffee_martini and run the following commands for rendering and evaluation:

bash scripts/render-neu3d.sh
bash scripts/eval-neu3d.sh

The evaluation results will be saved under eval/eval_results.

Training Guide

Step 1: Generate Segmentation Map using DEVA

First execute the demo script to generate segmentation maps:

cd submodules/4d-langsplat-tracking-anything-with-deva
bash scripts/download_models.sh # Download the model parameters if you are a first time user 
bash scripts/demo-chickchicken.sh

The output segmentation maps will be saved in submodules/4d-langsplat-tracking-anything-with-deva/output

Step 2: Extract CLIP and Video Features

Extract CLIP features:

bash scripts/extract_clip_features.sh

Generate video features:

bash scripts/generate-video-feature.sh

These commands will create two feature directories under your dataset path:

  • clip_features: Extracted by the CLIP model
  • video_features: Extracted by the E5 model
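
To confirm that both directories were created, a quick check for the americano scene (adjust the path to your own dataset):

ls data/hypernerf/americano/clip_features | head
ls data/hypernerf/americano/video_features | head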

Step 3: Train and Evaluate 4D LangSplat

Run the training and evaluation script:

bash scripts/train_eval.sh

This will train the 4D LangSplat field and perform evaluation.

TODO list

  • release the code of the 4d-langsplat-rasterization
  • release the code of the 4d-langsplat-tracking-anything-with-deva
  • release the code of the evaluation
  • release the code of the autoencoder
  • release the code of preprocessing
  • release the code of training
  • release the pretrained model
  • release the preprocessed dataset
  • update the arxiv link
