Any6D on MSR Server

This repository provides a simple Flask-based HTTP API wrapper around the original Any6D demo. Instead of running scripts locally, you can deploy the model as a network service and call it from any machine.

What it does

The server exposes a single HTTP endpoint that performs 6D pose estimation and mesh refinement using the Any6D framework.

Endpoint

POST /get_any6d_mesh

Input (multipart/form-data)

Required files:

  • obj - 3D mesh file in .obj format (with optional vertex colors)
  • color - RGB color image in .png format
  • intrinsic - Camera intrinsic matrix in .npy format (3×3 matrix)
  • mask - Binary mask image in .png format (object segmentation)
  • depth - Depth map in .npy format (in meters)

Optional parameters:

  • iteration - Number of registration iterations (default: 5)
  • debug - Debug level 0 or 1 (default: 0)
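The two `.npy` inputs can be prepared with NumPy. The sketch below uses placeholder values (the focal lengths, principal point, and image resolution are assumptions — substitute your camera's calibration and a real depth map in meters):

```python
"""Prepare the .npy inputs the endpoint expects (placeholder values)."""
import numpy as np

# 3x3 pinhole intrinsic matrix; fx, fy, cx, cy below are placeholders.
K = np.array([[615.0,   0.0, 320.0],
              [  0.0, 615.0, 240.0],
              [  0.0,   0.0,   1.0]], dtype=np.float64)
np.save("intrinsic.npy", K)

# Depth map in meters, same resolution as the color image (dummy data here).
depth = np.full((480, 640), 0.8, dtype=np.float32)  # flat surface at 0.8 m
np.save("depth.npy", depth)
```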

Processing Pipeline

The server performs the following steps:

  1. Mesh Loading & Validation

    • Loads and validates the input mesh
    • Automatically simplifies large meshes (>100k faces) to prevent CUDA OOM errors
    • Preserves vertex colors if present in the input mesh
  2. Mesh Alignment

    • Aligns the mesh to a canonical coordinate system using oriented bounding box (OBB)
    • Scales the mesh appropriately for processing
  3. 6D Pose Registration

    • Registers the mesh with the observed RGB-D data using Any6D
    • Estimates the 6D pose (rotation and translation) of the object
    • Refines the mesh scale and alignment based on the observed point cloud
    • Uses a render-and-compare strategy with pose hypothesis generation
  4. Mesh Refinement

    • Refines the mesh geometry to better match the observed depth data
    • Preserves vertex colors from the original mesh when possible
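The canonical alignment in step 2 can be illustrated in plain NumPy. This is a simplified sketch, not the server's actual implementation — it assumes a PCA-based oriented bounding box and unit-diagonal scaling:

```python
"""Sketch of OBB-based canonical alignment (step 2), simplified to PCA."""
import numpy as np

def align_to_obb(vertices):
    """Center vertices, rotate into the principal-axis (OBB) frame,
    and scale so the bounding-box diagonal has unit length."""
    centered = vertices - vertices.mean(axis=0)
    # Principal axes of the point set approximate the OBB orientation.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    rotated = centered @ vt.T          # coordinates in the OBB frame
    extent = rotated.max(axis=0) - rotated.min(axis=0)
    scale = 1.0 / np.linalg.norm(extent)
    return rotated * scale, vt, scale
```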

Output

Returns a refined .obj mesh file as an attachment with:

  • Updated vertex positions based on the registered pose and scale
  • Preserved vertex colors (if present in the input)
  • Optimized geometry aligned with the observed RGB-D data
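A minimal client call might look like the following. The server URL is a placeholder (substitute the MSR server's address and port), and the helper names are illustrative, not part of the API:

```python
"""Minimal client sketch for POST /get_any6d_mesh."""

SERVER_URL = "http://localhost:5000/get_any6d_mesh"  # placeholder address

REQUIRED_FIELDS = ("obj", "color", "intrinsic", "mask", "depth")

def form_data(iteration=5, debug=0):
    """Optional parameters, sent as form fields alongside the files."""
    return {"iteration": str(iteration), "debug": str(debug)}

def refine_mesh(paths, out_path="refined.obj", **params):
    """POST the five required files; save the returned refined .obj."""
    import requests  # assumes the requests package is installed
    files = {name: open(paths[name], "rb") for name in REQUIRED_FIELDS}
    try:
        resp = requests.post(SERVER_URL, files=files,
                             data=form_data(**params), timeout=600)
        resp.raise_for_status()
        with open(out_path, "wb") as f:
            f.write(resp.content)
    finally:
        for fh in files.values():
            fh.close()
    return out_path
```

`paths` maps each required field name to a local file, e.g. `{"obj": "mesh.obj", "color": "color.png", "intrinsic": "intrinsic.npy", "mask": "mask.png", "depth": "depth.npy"}`.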

Use Cases

  • Novel object pose estimation - Estimate 6D pose of objects without prior training
  • Mesh refinement - Refine 3D meshes using RGB-D observations
  • Scale estimation - Recover metric scale from a single RGB-D image
  • Cross-environment adaptation - Handle variations in lighting, occlusions, and viewpoints

Installation

Follow these steps to install Any6D on the MSR server.

git clone https://github.com/0nhc/Any6D.git
cd Any6D
micromamba create -n any6d python=3.10
micromamba activate any6d
micromamba install conda-forge::eigen=3.4.0 conda-forge::cuda-toolkit=12.1 boost
pip install -r requirements.txt
python -m pip install --quiet --no-cache-dir git+https://github.com/NVlabs/nvdiffrast.git
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder pytorch3d==0.7.8+pt2.4.1cu121
CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.10/site-packages/pybind11/share/cmake/pybind11 bash foundationpose/build_all_conda.sh

# Download Model Weights of FoundationPose
cd foundationpose/weights
./download_weights.sh
cd ../..

Quick Start

# Run these commands on the Lamb or Sheep server.
export CUDA_HOME=$CONDA_PREFIX
export CPATH=$CONDA_PREFIX/targets/x86_64-linux/include:$CONDA_PREFIX/include:$CPATH
export LD_LIBRARY_PATH=$CONDA_PREFIX/targets/x86_64-linux/lib:$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
python flask_server.py
# Run this script on your own laptop.
python flask_client.py

Any6D: Model-free 6D Pose Estimation of Novel Objects

This is the official implementation of our paper, accepted at CVPR 2025.

[Website] [Paper]

Authors: Taeyeop Lee, Bowen Wen, Minjun Kang, Gyuree Kang, In So Kweon, Kuk-Jin Yoon

Abstract

We introduce Any6D, a model-free framework for 6D object pose estimation that requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes. Unlike existing methods that rely on textured 3D models or multiple viewpoints, Any6D leverages a joint object alignment process to enhance 2D-3D alignment and metric scale estimation for improved pose accuracy. Our approach integrates a render-and-compare strategy to generate and refine pose hypotheses, enabling robust performance in scenarios with occlusions, non-overlapping views, diverse lighting conditions, and large cross-environment variations. We evaluate our method on five challenging datasets: REAL275, Toyota-Light, HO3D, YCBINEOAT, and LM-O, demonstrating its effectiveness in significantly outperforming state-of-the-art methods for novel object pose estimation.

Installation

# create conda environment
conda create -n Any6D python=3.9

# activate conda environment
conda activate Any6D

# Install Eigen3 3.4.0 under conda environment
conda install conda-forge::eigen=3.4.0
export CMAKE_PREFIX_PATH="$CMAKE_PREFIX_PATH:/eigen/path/under/conda"

# install dependencies (cuda 12.1, torch 2.4.1)
python -m pip install -r requirements.txt

# Install NVDiffRast
python -m pip install --quiet --no-cache-dir git+https://github.com/NVlabs/nvdiffrast.git

# Kaolin
python -m pip install --no-cache-dir kaolin==0.16.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.4.0_cu121.html

# PyTorch3D
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder pytorch3d==0.7.8+pt2.4.1cu121

# Build extensions
CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.9/site-packages/pybind11/share/cmake/pybind11 bash foundationpose/build_all_conda.sh

# build SAM2
cd sam2 && pip install -e . && cd checkpoints && \
./download_ckpts.sh && \
cd ..

# build InstantMesh 
cd instantmesh && pip install -e . && cd ..
# build bop_toolkit
cd bop_toolkit && python setup.py install && cd .. 

CheckPoints

Download the model checkpoints, then create the directory structure as follows:

foundationpose/
└── weights/
    ├── 2024-01-11-20-02-45/
    └── 2023-10-28-18-33-37/

sam2/
└── checkpoints/
    └── sam2.1_hiera_large.pt

instantmesh/
└── ckpts/
    ├── diffusion_pytorch_model.bin
    └── instant_mesh_large.ckpt
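A quick way to verify the layout above is a small existence check. This helper is illustrative (not part of the repo); it just reports which of the expected checkpoint paths are missing under a given root:

```python
"""Check that the expected checkpoint files/directories are in place."""
import os

EXPECTED = [
    "foundationpose/weights/2024-01-11-20-02-45",
    "foundationpose/weights/2023-10-28-18-33-37",
    "sam2/checkpoints/sam2.1_hiera_large.pt",
    "instantmesh/ckpts/diffusion_pytorch_model.bin",
    "instantmesh/ckpts/instant_mesh_large.ckpt",
]

def missing_checkpoints(root="."):
    """Return the expected paths that do not exist under root."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]
```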

Download dataset

Run Demo

python run_demo.py
python run_demo.py --img_to_3d # running instantmesh + sam2

Run on Public Datasets (HO3D)

Dataset Format

ho3d/
├── evaluation/         # HO3D evaluation files (e.g., annotations)
├── masks_XMem/         # Segmentation masks generated by XMem
└── YCB_Video_Models/   # 3D models for YCB objects (used in HO3D)

1. Run Anchor Image

We provide our inputs, image-to-3D results, and anchor results on Hugging Face.

python run_ho3d_anchor.py \
  --anchor_folder /anchor_results/dexycb_reference_view_ours \    # Path to anchor results
  --ycb_model_path /dataset/ho3d/YCB_Video_Models                 # Path to YCB models
  # --img_to_3d                                                   # Running instantmesh + sam2

2. Run Query Image

python run_ho3d_query.py \
  --anchor_path /anchor_results/dexycb_reference_view_ours \     # Path to anchor results
  --hot3d_data_root /dataset/ho3d \                              # Root path to HO3D dataset
  --ycb_model_path /dataset/ho3d/YCB_Video_Models                # Path to YCB models

Acknowledgement

We would like to acknowledge the public projects FoundationPose, InstantMesh, SAM2, Oryon, and bop_toolkit for their code releases. We also thank the CVPR reviewers and Area Chair for their appreciation of this work and their constructive feedback.

Citations

@inproceedings{lee2025any6d,
    title     = {{Any6D}: Model-free 6D Pose Estimation of Novel Objects},
    author    = {Lee, Taeyeop and Wen, Bowen and Kang, Minjun and Kang, Gyuree and Kweon, In So and Yoon, Kuk-Jin},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    year      = {2025},
}
