EgoMono4D: Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos

Chengbo Yuan, Geng Chen, Li Yi, Yang Gao.

[Project Website] [Arxiv] [BibTex]

EgoMono4D is an early exploration of self-supervised learning for generalizable 4D point cloud sequence reconstruction in the label-scarce egocentric domain, enabling fast and dense reconstruction. Unlike supervised methods such as DUSt3R or MonSt3R, it is trained solely on large-scale unlabeled video datasets, representing a new paradigm for 4D reconstruction.

teaser_pic

Installation

Install Conda Environment

First build the basic environment:

conda create -n egomono4d python=3.11
conda activate egomono4d
conda install -y pytorch==2.0.1 torchvision cudatoolkit=11.7 -c pytorch -c nvidia

Install UniDepthV1.5 and CoTrackerV2

Our project is based on UniDepth (commit bebc4b2) and CoTrackerV2. Both repositories have been updated since our project was released, so we need to roll back to the specific commits before installing them.

For UniDepth, we depend on commit bebc4b2. Run the following commands to install it:

git clone https://github.com/lpiccinelli-eth/UniDepth.git
cd UniDepth
git reset --hard bebc4b2
pip install -e .

For CoTracker, we depend on the CoTrackerV2 version, i.e. the cotracker2v1_release branch of the official repository. Run the following commands to install it:

git clone --branch cotracker2v1_release --single-branch https://github.com/facebookresearch/co-tracker.git
cd co-tracker
pip install -e .

Install Dependencies

pip install -r requirement.txt

If you want to run the evaluation, please also install the following:

pip install kaolin==0.16.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.0.1_cu117.html
pip install evo --upgrade --no-binary evo

Handle Pretrained Models Through Cache Folder

Finally, we manage the checkpoints of the off-the-shelf models together in a cache folder. The default directory is ./cache, which can be changed in the configuration file. The structure of the folder should be:

- ./cache
   - cotracker_checkpoints
      - cotracker2.pth
   - data_custom
   - gmflow_checkpoints
      - gmflow-scale1-mixdata-train320x576-4c3a6e9a.pth
   - models
   - unidepth_v2_checkpoints
      - unidepth-v2-vitl14.bin
      - unidepth-v2-vitl14.json

You can unzip cache.zip to check the details. Please download these checkpoints / weights and put them into the correct directories. Links: UniDepth, CoTrackerV2, GMFlow.
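
If you prefer to create this layout by hand before downloading, a minimal sketch (assuming the default ./cache location) is:

mkdir -p cache/cotracker_checkpoints cache/data_custom cache/gmflow_checkpoints cache/models cache/unidepth_v2_checkpoints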

These commands download the corresponding checkpoints from Hugging Face. Replace /path/to/cache with your own local directory path.

# download Unidepth checkpoints from huggingface
huggingface-cli download lpiccinelli/unidepth-v2-vitl14 --revision 1d0d3c5 --local-dir /path/to/cache/unidepth_v2_checkpoints/

# download gmflow checkpoints from huggingface
huggingface-cli download pranay-ar/unimatch gmflow-scale1-mixdata-train320x576-4c3a6e9a.pth --local-dir /path/to/cache/gmflow_checkpoints/

# download cotracker checkpoints from huggingface
huggingface-cli download facebook/cotracker cotracker2.pth  --local-dir /path/to/cache/cotracker_checkpoints/

Getting Started

To run the EgoMono4D example, first download the pretrained weights (ptr_all_350k) and put them in ./cache/models. Then run the command below:

INFER_MODE=True python -m egomono4d.inference_demo -m cache/models/ptr_all_350k -f examples/example_epic_kitchen

This will run EgoMono4D prediction and automatically generate two result files (result_example_epic_kitchen.ply and result_example_epic_kitchen.pkl). The ply file contains the overlaid point clouds of all frames and can be opened with software such as MeshLab. The pkl file is a pickle file containing a dict with:

result = {
    "xyzs": xyzs,               # (f, h, w, 3)
    "rgbs": rgbs,               # (f, h, w, 3)
    "depths": deps,             # (f, h, w),    depths used to recover point clouds. 
    "intrinsics": intrinsics,   # (f, 3, 3),    pixel2camera intrinsic
    "extrinsics": extrinsics,   # (f, 4, 4),    camera2world extrinsics
    "flys": fly_masks,          # (f, h, w),    flys of depths
    "weights": weight_msk,      # (f-1, h, w),  confident masks
}
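
As a minimal sketch of how this pickle file could be consumed afterwards (the file name follows the example above; the keys are those documented in the dict):

import pickle

# load the prediction produced by egomono4d.inference_demo
with open("result_example_epic_kitchen.pkl", "rb") as f:
    result = pickle.load(f)

# inspect the per-frame outputs; shapes follow the dict documented above
for key, value in result.items():
    print(key, getattr(value, "shape", type(value)))

# e.g. the point cloud and colors of the first frame, each shaped (h, w, 3)
xyz0, rgb0 = result["xyzs"][0], result["rgbs"][0]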

We provide a script for 4D visualization. Please run:

python -m egomono4d.demo_viser -f result_example_epic_kitchen.pkl

and then open http://localhost:8080/ in your browser. An interactive GUI will be available for better visualization of the 4D results.

vis_script_pic

Model Training

For model training and evaluation, we first need to pre-process the datasets. The data pre-processing depends on the EgoHOS model. Please first install that repository locally and put the EgoHOS checkpoints (the work_dirs folder from the official EgoHOS release) into ./cache/ego_hos_checkpoints.

We keep the unprocessed original datasets in ./cache/original_datasets and the processed datasets in ./cache/processed_datasets. Please download the datasets listed below to ./cache/original_datasets.

Dataset        Utilization     Download URL
HOI4D          train & eval    link
H2O            train & eval    link
FPHA           train           link
EgoPAT3D       train           link
Epic-Kitchen   train           link
ARCTIC         eval            link
POV-Surgery    eval            link

Dataset pre-processing is conducted automatically the first time the training scripts are started. To process a dataset manually, first change the configuration name in ./egomono4d/datagen.py and then run python -m egomono4d.datagen. For example, to process POV-Surgery independently, change Line 28 of datagen.py to config_name="datagen_pov_surgery" and run python -m egomono4d.datagen.

More details can be found in ./config and ./egomono4d/datagen.py. The processed results will be saved in ./cache/processed_datasets automatically. After data pre-processing, we can start training with:

python -m egomono4d.pretrain wandb.name=egomono4d_train

The training results will be saved in ./cache/models.

Evaluation

For evaluation, please run:

python -m egomono4d.evaluation -f $MODEL_FOLDER -b $BATCH_SIZE -c $CONFIG_PATH

where $MODEL_FOLDER is the folder containing the model checkpoints, $BATCH_SIZE is the evaluation batch size, and $CONFIG_PATH is the path to the evaluation configuration file, which can be found in ./config (e.g. ./config/pretrain_eval_hoi4d.yaml). The evaluation results will be saved in a JSON file.
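
For example, a possible invocation using the released ptr_all_350k checkpoint and the HOI4D evaluation config (the batch size here is only an illustrative value):

python -m egomono4d.evaluation -f cache/models/ptr_all_350k -b 4 -c ./config/pretrain_eval_hoi4d.yaml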

Acknowledgment

This repository is built upon code from FlowMap, EgoHOS, UniDepth, GMFlow, CoTrackerV2 and Viser. We sincerely appreciate their contributions to the open-source community, which have significantly supported this project.

Citation

If you find this repository useful, please kindly cite our work:

@article{yuan2024self-supervised,
  title={Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos},
  author={Yuan, Chengbo and Chen, Geng and Yi, Li and Gao, Yang},
  journal={arXiv preprint arXiv:2411.09145},
  year={2024}
}
