Ziyang Song, Jinxi Li, Bo Yang
We propose the first framework to represent dynamic 3D scenes in infinitely many ways from a monocular RGB video.
Our method enables infinite sampling of different 3D scenes that match the input monocular video in observed views.
Please first install a GPU-supported PyTorch version that fits your machine. We have tested with PyTorch 1.13.0.
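For example, a pip install of the tested version built for CUDA 11.7 might look like the following (the CUDA tag is an assumption; match it to your driver):

```
# PyTorch 1.13.0 built against CUDA 11.7; swap cu117 for your CUDA version
pip install torch==1.13.0+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
```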
Then please refer to the official guide and install PyTorch3D. We have tested with PyTorch3D 0.7.5.
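One option from PyTorch3D's install instructions is building the tagged release from source (a sketch; the tag name is assumed from the release numbering, and the official guide also lists prebuilt packages):

```
# Build PyTorch3D 0.7.5 from source; requires a working CUDA toolchain
pip install "git+https://github.com/facebookresearch/pytorch3d.git@v0.7.5"
```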
Install other dependencies:
```
pip install -r requirements
```

Our processed datasets can be downloaded from Google Drive.
If you want to work on your own dataset, please refer to the data preparation guide.
You can download all our pre-trained models from Google Drive.
To train on a scene (here the indoor chessboard scene), run:

```
python train.py config/indoor/chessboard.yaml --use_wandb
```

Specify --use_wandb to log the training with WandB.
To sample scenes from a trained model, run:

```
python sample.py config/indoor/chessboard.yaml --checkpoint ${CHECKPOINT}
```

${CHECKPOINT} is the checkpoint iteration to be loaded, e.g., 30000.
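For instance, to sample scenes from the checkpoint saved at iteration 30000:

```
python sample.py config/indoor/chessboard.yaml --checkpoint 30000
```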
To render views from a sampled scene, run:

```
python test.py config/indoor/chessboard.yaml --checkpoint ${CHECKPOINT} --n_sample_scale_test 1000 --scale_id ${SCALE_ID} --render_test
```

Specify --render_test to render testing views; otherwise, training views are rendered.
To evaluate the rendered views, run:

```
python evaluate.py --dataset_path ${DATASET_PATH} --render_path ${RENDER_PATH} --split test --eval_depth --eval_segm --mask
```

Specify --eval_depth to evaluate depth, --eval_segm to evaluate segmentation, and --mask to apply the co-visibility mask as in DyCheck.
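As a usage sketch with hypothetical paths (substitute your actual dataset and render directories):

```
# Paths below are placeholders for illustration only
python evaluate.py --dataset_path ./data/indoor/chessboard --render_path ./outputs/chessboard/test \
  --split test --eval_depth --eval_segm --mask
```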
If you find our work useful in your research, please consider citing:
```
@article{song2024,
  title={{OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos}},
  author={Song, Ziyang and Li, Jinxi and Yang, Bo},
  journal={ICML},
  year={2024}
}
```