MotionEdit: Benchmarking and Learning Motion-Centric Image Editing


Yixin Wan¹,², Lei Ke¹, Wenhao Yu¹, Kai-Wei Chang², Dong Yu¹

¹Tencent AI, Seattle   ²University of California, Los Angeles

Figure: MotionNFT example edits.

✨ Overview

MotionEdit is a novel dataset and benchmark for motion-centric image editing. We also propose MotionNFT (Motion-guided Negative-aware FineTuning), a post-training framework with motion alignment rewards to guide models on the motion-centric image editing task.

📣 News

  • [2025/12/11]: 🤩 We release MotionEdit, a novel dataset and benchmark for motion-centric image editing. Along with the dataset, we propose MotionNFT (Motion-guided Negative-aware FineTuning), a post-training framework with motion alignment rewards to guide models on the motion editing task.

🔧 Usage

🧱 To Start: Environment Setup

Clone this GitHub repository and switch into its directory.

git clone https://github.com/elainew728/motion-edit.git
cd motion-edit

Create and activate the conda environment with the dependencies that support inference and training.

  • Note: some models, such as UltraEdit, require specific versions of the diffusers library. Please refer to their official repositories to resolve these dependencies before running inference.
conda env create -f environment.yml
conda activate motionedit

Finally, configure your own Hugging Face token to access restricted models by replacing YOUR_HF_TOKEN_HERE in inference/run_image_editing.py.
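
If you prefer not to edit the script directly, a minimal alternative sketch (assuming the standard huggingface_hub client is installed in the environment above) is to log in once so downloads can pick up the cached token; depending on how the script passes its token, you may still need to edit it:

# Hedged sketch: authenticate with Hugging Face once, instead of hard-coding
# the token inside inference/run_image_editing.py.
from huggingface_hub import login

login(token="YOUR_HF_TOKEN_HERE")  # replace with your actual token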

🔹 Quick Single-Image Demo

If you just want to edit a single image with our MotionNFT checkpoint, place the original input image and your text prompt (as a .txt file with the same file name as the image) inside examples/input_examples/. Then, run examples/run_inference_single.py to run inference on the input image with your prompt.

We have prepared 3 input images from our MotionEdit-Bench dataset in the examples/input_examples/ folder. Play around with them by running the following example code:

python examples/run_inference_single.py \
    --input_image examples/input_examples/512.jpg \
    --output_dir examples/output_examples

The script automatically loads examples/input_examples/512.txt when --prompt is omitted. You can still override the prompt or supply a local LoRA via --prompt/--lora_path if needed.
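
To try your own image, prepare an input pair in the same folder. A minimal sketch (the file name my_photo.jpg and the prompt text are hypothetical placeholders):

# Hedged sketch: drop a custom image and a matching .txt prompt into the
# folder the single-image demo reads from. Names below are placeholders.
import shutil
from pathlib import Path

input_dir = Path("examples/input_examples")
shutil.copy("my_photo.jpg", input_dir / "my_photo.jpg")           # your source image
(input_dir / "my_photo.txt").write_text("Make the person jump.")  # matching prompt

Then pass --input_image examples/input_examples/my_photo.jpg to the command above.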

🚀 Training with MotionNFT

We are working on releasing and refining the training pipeline using our MotionNFT method. Stay tuned!

To run training code, first change your working directory to the train folder:

cd train

Step 0: Data Format

Please format your training data according to the following structure. Place your {}_metadata.jsonl files under the motionedit_data/ folder inside the train/ directory.

Data Folder structure:

- motionedit_data
  - images/
     - YOUR_IMAGE_DATA
     - ...
  - train_metadata.jsonl
  - test_metadata.jsonl

train_metadata.jsonl and test_metadata.jsonl format:

{"prompt": "PROMPT", "image": ["INPUT_IMAGE_PATH", "TARGET_IMAGE_PATH"]}
...
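
For reference, here is a minimal sketch that writes entries in this format (the prompt and image paths are hypothetical placeholders):

# Hedged sketch: write training metadata in the jsonl format described above.
# The prompt and image paths are hypothetical placeholders.
import json

records = [
    {
        "prompt": "Make the dog leap over the fence.",
        "image": ["images/dog_input.jpg", "images/dog_target.jpg"],
    },
]

with open("motionedit_data/train_metadata.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")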

Step 1: Deploy vLLM Reward Server

To set up the vLLM server for the MLLM feedback reward, first configure the path to your local Qwen2.5-VL-32B-Instruct model checkpoint by modifying YOUR_MODEL_PATH in train/reward_server/reward_server.py.

Then, you can start the reward server:

python reward_server/reward_server.py
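
Before launching training, you may want to confirm the server is reachable from the training machines. A minimal sketch (the host and port are whatever you later export as REWARD_SERVER in Step 3; 12341 is the port used in the example there):

# Hedged sketch: check that the reward server accepts TCP connections.
# Replace host/port with the address you export as REWARD_SERVER in Step 3.
import socket

host, port = "127.0.0.1", 12341  # placeholder address; match your deployment
try:
    with socket.create_connection((host, port), timeout=5):
        print(f"Reward server reachable at {host}:{port}")
except OSError as exc:
    print(f"Could not reach reward server at {host}:{port}: {exc}")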

Step 2: Configure Training

See train/config/qwen_image_edit_nft.py and train/config/kontext_nft.py for available configurations.

Step 3: Run Training

export REWARD_SERVER=[YOUR_REWARD_SERVICE_IP_ADDR]:12341
RANK=[MACHINE_RANK]
MASTER_ADDR=[MASTER_ADDR]
MASTER_PORT=[MASTER_PORT]

accelerate launch --config_file flow_grpo/accelerate_configs/deepspeed_zero2.yaml \
    --num_machines 2 --num_processes 16 \
    --machine_rank ${RANK} --main_process_ip ${MASTER_ADDR} --main_process_port ${MASTER_PORT} \
    scripts/train_nft_qwen_image_edit.py --config config/qwen_image_edit_nft.py:qwen_motion_edit_reward 

🔍 Large-Scale Inference on MotionEdit-Bench with Image Editing Models

We have released MotionEdit-Bench on Hugging Face. In this GitHub repository, we provide code that supports easy inference across open-source image editing models: Qwen-Image-Edit, Flux.1 Kontext [Dev], InstructPix2Pix, HQ-Edit, Step1X-Edit, UltraEdit, MagicBrush, and AnyEdit.

Step 1: Data Preparation

The inference script defaults to using our MotionEdit-Bench, which will download the dataset from Hugging Face. You can specify a cache_dir for storing the cached data.

Additionally, you can construct your own dataset for inference. Please organize all input images into a folder INPUT_FOLDER and create a metadata.jsonl in the same directory. Each entry in metadata.jsonl must contain at least the following two fields:

{"file_name": "IMAGE_NAME.EXT", "prompt": "PROMPT"}
...

Then, load your dataset by:

from datasets import load_dataset
dataset = load_dataset("imagefolder", data_dir=INPUT_FOLDER)
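
For example, a minimal sketch (the folder name my_inputs/ and its contents are hypothetical placeholders) that builds the metadata file and loads the resulting dataset:

# Hedged sketch: build metadata.jsonl for a custom inference folder and load it.
# "my_inputs/" and the entry values below are hypothetical placeholders.
import json
from pathlib import Path
from datasets import load_dataset

input_folder = Path("my_inputs")
entries = [
    {"file_name": "cat.jpg", "prompt": "Make the cat stretch its front legs."},
]
with open(input_folder / "metadata.jsonl", "w") as f:
    for entry in entries:
        f.write(json.dumps(entry) + "\n")

dataset = load_dataset("imagefolder", data_dir=str(input_folder))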

Step 2: Running Inference

Use the following command to run inference on MotionEdit-Bench with our MotionNFT checkpoint, trained on MotionEdit with Qwen-Image-Edit as the base model:

python inference/run_image_editing.py \
    -o "./outputs/" \
    -m "motionedit" \
    --seed 42

Alternatively, our code supports running inference with multiple open-source image editing models. You can select the model of your choice via the -m argument. For instance, here is a sample command for running inference with Qwen-Image-Edit:

python inference/run_image_editing.py \
    -o "./outputs/" \
    -m "qwen-image-edit" \
    --seed 42

✏️ Citing

Please consider citing our paper if you find our research useful. We appreciate your recognition!

@article{motionedit,
      title={MotionEdit: Benchmarking and Learning Motion-Centric Image Editing}, 
      author={Yixin Wan and Lei Ke and Wenhao Yu and Kai-Wei Chang and Dong Yu},
      year={2025},
      journal={arXiv preprint arXiv:2512.10284},
}
