A diffusion-based unified framework capable of repositioning, inserting, replacing, and deleting objects in driving scenario videos.
```bash
git clone git@github.com:yvanliang/DriveEditor.git
conda create -n DriveEditor python=3.10 -y
conda activate DriveEditor
pip install torch==2.1.1 torchvision==0.16.1 xformers==0.0.23 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
pip install .
pip install -e git+https://github.com/Stability-AI/datapipelines.git@main#egg=sdata  # install sdata for training
```
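As a quick sanity check after installation, a short snippet like the following (not part of the repository, purely illustrative) can confirm that the pinned PyTorch and xformers builds imported correctly and that a CUDA device is visible:

```python
# Illustrative environment check (not shipped with DriveEditor): verifies the
# pinned torch/xformers versions and CUDA visibility before running the demo.
import torch
import xformers

print("torch:", torch.__version__)        # expected 2.1.1
print("xformers:", xformers.__version__)  # expected 0.0.23
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU 0: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
```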
- Download the demo data from Google Drive and place it in the `checkpoints` directory.
- Download the pretrained models from Google Drive and place them in the `checkpoints` directory.
- Download `sv3d_p.safetensors` and `svd.safetensors` from the Hugging Face model hub and place them in the `checkpoints` directory.
- Execute the following command to combine the two models (a rough sketch of what such a step might do is shown after this list):

  ```bash
  python scripts/combine_ckpts.py
  ```
- Download the toy training data from Google Drive and extract it into the `checkpoints` directory to obtain `train_data.pkl`.
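The internals of `scripts/combine_ckpts.py` are not shown here; as a purely hypothetical sketch of what combining the two downloaded checkpoints could look like, one might merge the SVD and SV3D weight dictionaries into a single file with `safetensors`. The key prefixes and output filename below are assumptions, not the script's actual behavior:

```python
# Hypothetical sketch only -- the real logic lives in scripts/combine_ckpts.py.
# Merges the SVD and SV3D state dicts into one checkpoint; the key prefixes and
# output filename are illustrative assumptions.
from safetensors.torch import load_file, save_file

svd = load_file("checkpoints/svd.safetensors")
sv3d = load_file("checkpoints/sv3d_p.safetensors")

combined = {}
combined.update({f"svd.{k}": v for k, v in svd.items()})
combined.update({f"sv3d.{k}": v for k, v in sv3d.items()})

save_file(combined, "checkpoints/combined.safetensors")
```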
We provide a Gradio demo for editing driving scenario videos. To run the demo, a GPU with more than 32 GB of VRAM is required. Execute the following command:

```bash
python interactive_gui.py
```
If you don't have a GPU with more than 32 GB of VRAM but have two 24 GB GPUs, you can split inference across both GPUs, although it will take more time. First, modify `sgm/modules/diffusionmodules/video_model.py`:

- At line 684, add:

  ```python
  h_out_3d = h_out_3d.to(x.device)
  hs_3d_all = [t.to(x.device) for t in hs_3d_all]
  ```

- At line 794, add:

  ```python
  x = x.to("cuda:1")
  timesteps = timesteps.to("cuda:1")
  context = context.to("cuda:1")
  y = y.to("cuda:1")
  if time_context is not None:
      time_context = time_context.to("cuda:1")
  image_only_indicator = image_only_indicator.to("cuda:1")
  ```

Then run the following command:

```bash
python interactive_gui_2gpu.py
```
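The edits above follow the standard PyTorch pattern for splitting a model across two GPUs: inputs and activations are moved to the device of the layers that consume them, and results are moved back before they are combined. A minimal, generic sketch of that pattern (the module names are illustrative and do not correspond to DriveEditor's actual architecture) looks like this:

```python
# Generic two-GPU model-parallel sketch (requires two visible CUDA devices);
# the modules here are placeholders, not DriveEditor's layers.
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Linear(128, 128).to("cuda:0")  # first half lives on GPU 0
        self.stage2 = nn.Linear(128, 128).to("cuda:1")  # second half lives on GPU 1

    def forward(self, x):
        h = self.stage1(x.to("cuda:0"))
        h = self.stage2(h.to("cuda:1"))  # hand activations to GPU 1 (like the line-794 edit)
        return h.to("cuda:0")            # bring the result back (like the line-684 edit)

if torch.cuda.device_count() >= 2:
    out = TwoGPUModel()(torch.randn(4, 128))
    print(out.device)  # cuda:0
```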
To train the model, execute the following command:

```bash
python main.py -b configs/train.yaml --wandb --enable_tf32 True --no-test
```
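The `--enable_tf32 True` flag presumably toggles PyTorch's TF32 mode, which trades a small amount of matmul precision for speed on Ampere and newer GPUs; the equivalent manual switches in PyTorch are:

```python
# Manual TF32 switches in PyTorch; --enable_tf32 True presumably maps to these.
import torch

torch.backends.cuda.matmul.allow_tf32 = True  # allow TF32 for matrix multiplications
torch.backends.cudnn.allow_tf32 = True        # allow TF32 for cuDNN convolutions
```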
We appreciate the released code of Stable Video Diffusion and ChatSim.