This repository contains the source code for our paper:
Iterative Geometry Encoding Volume for Stereo Matching
CVPR 2023
Gangwei Xu, Xianqi Wang, Xiaohuan Ding, Xin Yang
Pretrained models can be downloaded from Google Drive.
You can demo a trained model on pairs of images. To predict disparity for Middlebury, run
python demo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth
- NVIDIA RTX 3090
- Python 3.8
- PyTorch 1.12
conda create -n IGEV_Stereo python=3.8
conda activate IGEV_Stereo
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch -c nvidia
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib
pip install tqdm
pip install timm==0.5.4
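After running the install commands above, it can help to confirm that each package is visible to the active environment. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the mapping from package names to import names (for example `opencv-python` to `cv2`) is an assumption based on the usual conventions for those packages. It only checks that the modules can be located; it does not verify versions or CUDA support.

```python
import importlib.util

# Package name (as installed above) -> import name.
# Assumption: these are the conventional import names for each package.
REQUIRED_MODULES = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "scikit-image": "skimage",
    "tensorboard": "tensorboard",
    "matplotlib": "matplotlib",
    "tqdm": "tqdm",
    "timm": "timm",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be located."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated `IGEV_Stereo` environment; any package it reports as missing can be reinstalled with the corresponding command above.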
To evaluate/train IGEV-Stereo, you will need to download the required datasets.
By default, stereo_datasets.py searches for the datasets in these locations.
├── /data
    ├── sceneflow
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2012
            ├── training
            ├── testing
            ├── vkitti
        ├── KITTI_2015
            ├── training
            ├── testing
            ├── vkitti
    ├── Middlebury
        ├── trainingH
        ├── trainingH_GT
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
    ├── DTU_data
        ├── dtu_train
        ├── dtu_test
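Before starting a long training or evaluation run, it may be worth checking that the directories listed above are actually present. The sketch below is illustrative only: `missing_dataset_dirs` is a hypothetical helper, not something provided by stereo_datasets.py, and the nesting of the sub-directories (for example `vkitti` under each KITTI folder) is an assumption read off the listing above.

```python
from pathlib import Path

# Sub-directories from the layout listed above, relative to the data root.
# Assumption: the nesting used here matches the intended directory tree.
EXPECTED_DIRS = [
    "sceneflow/frames_finalpass",
    "sceneflow/disparity",
    "KITTI/KITTI_2012/training",
    "KITTI/KITTI_2012/testing",
    "KITTI/KITTI_2012/vkitti",
    "KITTI/KITTI_2015/training",
    "KITTI/KITTI_2015/testing",
    "KITTI/KITTI_2015/vkitti",
    "Middlebury/trainingH",
    "Middlebury/trainingH_GT",
    "ETH3D/two_view_training",
    "ETH3D/two_view_training_gt",
    "DTU_data/dtu_train",
    "DTU_data/dtu_test",
]


def missing_dataset_dirs(root="/data", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories under /data:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected dataset directories were found.")
```

Only the datasets you intend to use need to be present, so a non-empty report is not necessarily an error; pass a shorter `expected` list to check a single dataset.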
To evaluate on Scene Flow, Middlebury, or ETH3D, run
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset sceneflow
or
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset middlebury_H
or
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset eth3d
To train on Scene Flow, run
python train_stereo.py
To train on KITTI, run
python train_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset kitti
For submission to the KITTI benchmark, run
python save_disp.py
To train on DTU, run
python train_mvs.py
To evaluate on DTU, run
python evaluate_mvs.py
If you find our work useful in your research, please consider citing our paper:
@inproceedings{xu2023iterative,
title={Iterative Geometry Encoding Volume for Stereo Matching},
author={Xu, Gangwei and Wang, Xianqi and Ding, Xiaohuan and Yang, Xin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2023}
}
This project is heavily based on RAFT-Stereo. We thank the original authors for their excellent work.