Jay Karhade Nikhil Keetha Yuchen Zhang Tanisha Gupta
Akash Sharma Sebastian Scherer Deva Ramanan
Carnegie Mellon University
TLDR: Any4D is a multi-view transformer for
• Feed-forward • Dense • Metric-scale • Multi-modal
4D reconstruction of dynamic scenes from RGB videos and diverse setups.
- The inference code will be refined and updated over the next few days.
- A stronger and more generalizable model checkpoint, along with the full training code, will be released soon.
Stay tuned for updates!
git clone https://github.com/Any-4D/Any4D.git
cd Any4D
# Create and activate conda environment
conda create -n any4d python=3.12 -y
conda activate any4d
# Optional: Install torch, torchvision & torchaudio specific to your system
# Install Any4D
pip install -e .
# For all optional dependencies
# See pyproject.toml for more details
pip install -e ".[all]"
pre-commit install
Note that we don't pin a specific version of PyTorch or CUDA in our requirements. Please feel free to install PyTorch based on your specific system.
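Since PyTorch is left to the user's system, a quick sanity check after installation can confirm which build (if any) is active. This is a generic sketch using only standard imports; it is not part of the Any4D codebase:

```python
def torch_status() -> str:
    """Report whether PyTorch and CUDA are usable in the current environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch not installed; install a build matching your CUDA or CPU setup."
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"PyTorch {torch.__version__} detected, running on {device}."

print(torch_status())
```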
We release the pre-trained Any4D model checkpoint on Hugging Face and Google Drive:
# Option 1: Hugging Face
mkdir -p checkpoints
wget -P checkpoints https://huggingface.co/airlabshare/any4d-checkpoint/resolve/main/any4d_4v_combined.pth
# Option 2: Google Drive
mkdir -p checkpoints
cd checkpoints
gdown --folder https://drive.google.com/drive/folders/1SOWr61vuv_bGtow6diAiWpIoUT50qSpk
For quick example inference, you can run the following command:
# Terminal 1: Start the Rerun server
rerun serve --port 9877
# Terminal 2: Run Any4D demo
python scripts/demo_inference.py --video_images_folder_path assets/stroller --viz --port 9877
We provide multiple examples at assets/example_images. Please see the Rerun Demo section below for more control over visualization.
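Before running inference, it can also help to verify that the downloaded checkpoint is present and intact on disk. The sketch below is a generic, standard-library integrity check; the authors do not publish an official size or hash, so treat any expected values as placeholders:

```python
import hashlib
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large checkpoints fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

ckpt = Path("checkpoints/any4d_4v_combined.pth")
if ckpt.exists():
    size_gb = ckpt.stat().st_size / 1e9
    print(f"{ckpt.name}: {size_gb:.2f} GB, sha256={file_sha256(ckpt)[:16]}")
else:
    print(f"Missing checkpoint: {ckpt}; re-run the download step above.")
```

Comparing the hash across machines is a quick way to rule out a corrupted or truncated download.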
We provide multiple interactive demos to try out Any4D!
Try our online demo without installation: 🤗 Hugging Face Demo
We provide a script to launch our Gradio app. The interface allows you to upload image sequences or videos, run 4D reconstruction, and interactively view the results. You can launch it using:
# Install requirements for the app
pip install -e ".[gradio]"
# Launch app locally
python scripts/any4d_gradio.py
We provide a demo script for interactive 4D visualization of metric reconstruction results using Rerun.
# Terminal 1: Start the Rerun server
rerun serve --port 9877 --web-viewer-port 9879
# Terminal 2: Run Any4D demo
python scripts/demo_inference.py \
--image_folder /path/to/your/image/sequence \
--checkpoint_path /path/to/your/checkpoint \
--start_idx start_num \
--end_idx end_num \
--ref_img_idx ref_num \
--ref_img_binary_mask_path /path/to/ref/image/binary/mask \
--use_scene_flow_mask_refined True \
--viz \
--port 9877
# Terminal 3 or Local Machine: Open the web viewer at http://127.0.0.1:9879 (you may need to port forward if using a remote server)
Optionally, if Rerun is installed locally, a local Rerun viewer can be spawned using: rerun --connect rerun+http://127.0.0.1:2004/proxy.
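The --start_idx, --end_idx, and --ref_img_idx flags select a window of frames and a reference frame within it. A minimal sketch of that selection logic is below; the helper name, the glob pattern, and the inclusive-end convention are assumptions for illustration, so check demo_inference.py for the script's actual behavior:

```python
from pathlib import Path

def select_frames(image_folder: str, start_idx: int, end_idx: int, ref_img_idx: int):
    """Pick a contiguous window of frames plus one reference frame.

    Assumes frames sort lexicographically (e.g. 000000.png, 000001.png, ...)
    and that end_idx is inclusive.
    """
    frames = sorted(Path(image_folder).glob("*.png"))
    if not (start_idx <= ref_img_idx <= end_idx):
        raise ValueError("ref_img_idx must fall inside [start_idx, end_idx]")
    window = frames[start_idx : end_idx + 1]
    ref = frames[ref_img_idx]
    return window, ref
```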
We thank the following projects for their open-source code: MapAnything, DUSt3R, MASt3R, MoGe, VGGT, and DINOv2.
If you find our repository useful, please consider giving it a star ⭐ and citing our paper in your work:
@misc{karhade2025any4d,
title={{Any4D}: Unified Feed-Forward Metric {4D} Reconstruction},
author={Jay Karhade and Nikhil Keetha and Yuchen Zhang and Tanisha Gupta and Akash Sharma and Sebastian Scherer and Deva Ramanan},
year={2025},
note={arXiv preprint}
}