This repository provides a simple Flask-based HTTP API wrapper around the original Any6D demo. Instead of running scripts locally, you can deploy the model as a network service and call it from any machine.
The server exposes a single HTTP endpoint that performs 6D pose estimation and mesh refinement using the Any6D framework.
POST /get_any6d_mesh
Required files:
- `obj` - 3D mesh file in .obj format (with optional vertex colors)
- `color` - RGB color image in .png format
- `intrinsic` - Camera intrinsic matrix in .npy format (3×3 matrix)
- `mask` - Binary mask image in .png format (object segmentation)
- `depth` - Depth map in .npy format (in meters)
Optional parameters:
- `iteration` - Number of registration iterations (default: 5)
- `debug` - Debug level, 0 or 1 (default: 0)
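For reference, here is a minimal client sketch using the `requests` library. The server URL, port, and file names are assumptions; see `flask_client.py` in this repository for the actual client.

```python
# Minimal client sketch -- URL, port, and file names are assumptions.
import requests

SERVER_URL = "http://localhost:5000/get_any6d_mesh"  # assumed host/port

files = {
    "obj": open("object.obj", "rb"),           # 3D mesh (optional vertex colors)
    "color": open("color.png", "rb"),          # RGB image
    "intrinsic": open("intrinsic.npy", "rb"),  # 3x3 camera intrinsic matrix
    "mask": open("mask.png", "rb"),            # binary segmentation mask
    "depth": open("depth.npy", "rb"),          # depth map in meters
}
data = {"iteration": 5, "debug": 0}            # optional parameters

response = requests.post(SERVER_URL, files=files, data=data)
response.raise_for_status()

# The refined mesh is returned as an .obj attachment.
with open("refined_mesh.obj", "wb") as f:
    f.write(response.content)
```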
The server performs the following steps:
1. Mesh Loading & Validation
   - Loads and validates the input mesh
   - Automatically simplifies large meshes (>100k faces) to prevent CUDA out-of-memory (OOM) errors
   - Preserves vertex colors if present in the input mesh
2. Mesh Alignment
   - Aligns the mesh to a canonical coordinate system using its oriented bounding box (OBB)
   - Scales the mesh appropriately for processing (steps 1-2 are sketched in code after this list)
3. 6D Pose Registration
   - Registers the mesh against the observed RGB-D data using Any6D
   - Estimates the 6D pose (rotation and translation) of the object
   - Refines the mesh scale and alignment based on the observed point cloud
   - Uses a render-and-compare strategy with pose hypothesis generation
4. Mesh Refinement
   - Refines the mesh geometry to better match the observed depth data
   - Preserves vertex colors from the original mesh when possible
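Steps 1 and 2 can be pictured with a short trimesh sketch. This is illustrative only, not the server's actual code; the 100k-face threshold comes from the description above, and `load_and_canonicalize` is a hypothetical helper.

```python
# Illustrative sketch of mesh loading, simplification, and OBB alignment
# (hypothetical helper, not the server's actual implementation).
import numpy as np
import trimesh

def load_and_canonicalize(path, max_faces=100_000):
    mesh = trimesh.load(path, force="mesh")

    # Simplify very large meshes to avoid CUDA OOM during rendering.
    # NOTE: the server also carries vertex colors through simplification;
    # that bookkeeping is omitted here for brevity.
    if len(mesh.faces) > max_faces:
        mesh = mesh.simplify_quadric_decimation(face_count=max_faces)

    # Align to a canonical frame by undoing the oriented bounding box (OBB)
    # transform, so the OBB is axis-aligned and centered at the origin.
    obb_transform = mesh.bounding_box_oriented.primitive.transform
    mesh.apply_transform(np.linalg.inv(obb_transform))
    return mesh
```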
Returns a refined .obj mesh file as an attachment with:
- Updated vertex positions based on the registered pose and scale
- Preserved vertex colors (if present in the input)
- Optimized geometry aligned with the observed RGB-D data
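Assuming the response was saved as `refined_mesh.obj` (as in the client sketch above), the result can be inspected with trimesh:

```python
import trimesh

mesh = trimesh.load("refined_mesh.obj", process=False)  # keep vertices as returned
print("vertices:", mesh.vertices.shape, "faces:", mesh.faces.shape)

# Vertex colors, if present in the input, survive the round trip.
if mesh.visual.kind == "vertex":
    print("vertex colors:", mesh.visual.vertex_colors.shape)
```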
- Novel object pose estimation - Estimate 6D pose of objects without prior training
- Mesh refinement - Refine 3D meshes using RGB-D observations
- Scale estimation - Recover metric scale from a single RGB-D image
- Cross-environment adaptation - Handle variations in lighting, occlusions, and viewpoints
Follow these steps to install Any6D on the MSR server.
git clone https://github.com/0nhc/Any6D.git
cd Any6D
micromamba create -n any6d python=3.10
micromamba activate any6d
micromamba install conda-forge::eigen=3.4.0 conda-forge::cuda-toolkit=12.1 boost
pip install -r requirements.txt
python -m pip install --quiet --no-cache-dir git+https://github.com/NVlabs/nvdiffrast.git
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder pytorch3d==0.7.8+pt2.4.1cu121
CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.10/site-packages/pybind11/share/cmake/pybind11 bash foundationpose/build_all_conda.sh
# Download Model Weights of FoundationPose
cd foundationpose/weights
./download_weights.sh
cd ../..

# Run these commands on the Lamb or Sheep server.
export CUDA_HOME=$CONDA_PREFIX
export CPATH=$CONDA_PREFIX/targets/x86_64-linux/include:$CONDA_PREFIX/include:$CPATH
export LD_LIBRARY_PATH=$CONDA_PREFIX/targets/x86_64-linux/lib:$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
python flask_server.py

# Run this script on your own laptop.
python flask_client.py

This is the official implementation of our paper, accepted to CVPR 2025.
Authors: Taeyeop Lee, Bowen Wen, Minjun Kang, Gyuree Kang, In So Kweon, Kuk-Jin Yoon
We introduce Any6D, a model-free framework for 6D object pose estimation that requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes. Unlike existing methods that rely on textured 3D models or multiple viewpoints, Any6D leverages a joint object alignment process to enhance 2D-3D alignment and metric scale estimation for improved pose accuracy. Our approach integrates a render-and-compare strategy to generate and refine pose hypotheses, enabling robust performance in scenarios with occlusions, non-overlapping views, diverse lighting conditions, and large cross-environment variations. We evaluate our method on five challenging datasets: REAL275, Toyota-Light, HO3D, YCBINEOAT, and LM-O, demonstrating its effectiveness in significantly outperforming state-of-the-art methods for novel object pose estimation.
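The render-and-compare loop at the heart of the method can be summarized as a generic skeleton. This is a conceptual sketch only: every callable here is supplied by the caller, and none of the names correspond to the released Any6D code.

```python
def render_and_compare(mesh, obs_rgbd, intrinsic,
                       sample_hypotheses, render, refine, score, n_refine=5):
    """Conceptual render-and-compare skeleton (not the released Any6D code):
    generate pose hypotheses, iteratively refine each one by comparing a
    synthetic render against the observation, and keep the best-scoring pose."""
    best_pose, best_score = None, float("inf")
    for pose in sample_hypotheses(mesh, obs_rgbd, intrinsic):
        for _ in range(n_refine):
            rendered = render(mesh, pose, intrinsic)   # synthesize an RGB-D view
            pose = refine(pose, rendered, obs_rgbd)    # nudge pose toward the observation
        s = score(render(mesh, pose, intrinsic), obs_rgbd)
        if s < best_score:
            best_pose, best_score = pose, s
    return best_pose
```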
# create conda environment
conda create -n Any6D python=3.9
# activate conda environment
conda activate Any6D
# Install Eigen3 3.4.0 under conda environment
conda install conda-forge::eigen=3.4.0
export CMAKE_PREFIX_PATH="$CMAKE_PREFIX_PATH:/eigen/path/under/conda"
# install dependencies (cuda 12.1, torch 2.4.1)
python -m pip install -r requirements.txt
# Install NVDiffRast
python -m pip install --quiet --no-cache-dir git+https://github.com/NVlabs/nvdiffrast.git
# Kaolin
python -m pip install --no-cache-dir kaolin==0.16.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.4.0_cu121.html
# PyTorch3D
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder pytorch3d==0.7.8+pt2.4.1cu121
# Build extensions
CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.9/site-packages/pybind11/share/cmake/pybind11 bash foundationpose/build_all_conda.sh
# build SAM2
cd sam2 && pip install -e . && cd checkpoints && \
./download_ckpts.sh && \
cd ../..
# build InstantMesh
cd instantmesh && pip install -e . && cd ..
# build bop_toolkit
cd bop_toolkit && python setup.py install && cd ..
Download the model checkpoints for FoundationPose, SAM2, and InstantMesh from their respective sources.
Create the directory structure as follows:
foundationpose/
└── weights/
├── 2024-01-11-20-02-45/
└── 2023-10-28-18-33-37/
sam2/
└── checkpoints/
└── sam2.1_hiera_large.pt
instantmesh/
└── ckpts/
├── diffusion_pytorch_model.bin
└── instant_mesh_large.ckpt
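A quick sanity check that the expected checkpoint layout is in place (paths taken directly from the tree above):

```python
# Verify the checkpoint layout described above.
from pathlib import Path

expected = [
    "foundationpose/weights/2024-01-11-20-02-45",
    "foundationpose/weights/2023-10-28-18-33-37",
    "sam2/checkpoints/sam2.1_hiera_large.pt",
    "instantmesh/ckpts/diffusion_pytorch_model.bin",
    "instantmesh/ckpts/instant_mesh_large.ckpt",
]
missing = [p for p in expected if not Path(p).exists()]
if missing:
    print("Missing checkpoints:")
    for p in missing:
        print(" ", p)
else:
    print("All checkpoints found.")
```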
python run_demo.py
python run_demo.py --img_to_3d # run InstantMesh + SAM2
ho3d/
├── evaluation/ # HO3D evaluation files (e.g., annotations)
├── masks_XMem/ # Segmentation masks generated by XMem
└── YCB_Video_Models/ # 3D models for YCB objects (used in HO3D)
We provide our inputs, image-to-3D results, and anchor results on Hugging Face.
# --anchor_folder: path to anchor results
# --ycb_model_path: path to YCB models
# Add --img_to_3d to run InstantMesh + SAM2.
python run_ho3d_anchor.py \
    --anchor_folder /anchor_results/dexycb_reference_view_ours \
    --ycb_model_path /dataset/ho3d/YCB_Video_Models
# --anchor_path: path to anchor results
# --hot3d_data_root: root path to the HO3D dataset
# --ycb_model_path: path to YCB models
python run_ho3d_query.py \
    --anchor_path /anchor_results/dexycb_reference_view_ours \
    --hot3d_data_root /dataset/ho3d \
    --ycb_model_path /dataset/ho3d/YCB_Video_Models
We would like to acknowledge the public projects FoundationPose, InstantMesh, SAM2, Oryon, and bop_toolkit for releasing their code. We also thank the CVPR reviewers and Area Chair for their appreciation of this work and their constructive feedback.
@inproceedings{lee2025any6d,
title = {{Any6D}: Model-free 6D Pose Estimation of Novel Objects},
author = {Lee, Taeyeop and Wen, Bowen and Kang, Minjun and Kang, Gyuree and Kweon, In So and Yoon, Kuk-Jin},
booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
year = {2025},
}