# MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval [[Paper](https://arxiv.org/abs/2412.20816)]
by Seojeong Park<sup>1</sup>, Jiho Choi<sup>1</sup>, Kyungjune Baek<sup>2</sup>, Hyunjung Shim<sup>1</sup>

<sup>1</sup> Korea Advanced Institute of Science and Technology (KAIST), <sup>2</sup> Sejong University
This repository provides the official implementation of MomentMix and the Length-Aware Decoder (LAD), which improve short-moment retrieval in video moment retrieval tasks.
- **MomentMix** – a two-stage temporal data augmentation (see the sketch after this list):
  - **ForegroundMix** – splits long moments into shorter segments and shuffles them to enhance recognition of query-relevant frames.
  - **BackgroundMix** – preserves the foreground while replacing background regions with segments from other videos, improving discrimination between relevant and irrelevant frames.
- **Length-Aware Decoder (LAD)** – uses length-wise bipartite matching to pair predictions and ground truths within the same length category (short / middle / long), enabling length-specific decoder queries.
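The two augmentation stages can be illustrated with a minimal sketch operating on clip-level feature arrays. This is not the repository's implementation (see `momentmix/README.md` for the actual pipeline); the feature layout, segment count, and helper names below are assumptions for illustration only.

```python
import random
import numpy as np

def foreground_mix(feats, start, end, num_segments=3, seed=0):
    """ForegroundMix (sketch): split a long moment's foreground into
    shorter segments and shuffle their order within the moment span."""
    segments = np.array_split(feats[start:end], num_segments)
    random.Random(seed).shuffle(segments)
    out = feats.copy()
    out[start:end] = np.concatenate(segments)
    return out

def background_mix(feats, start, end, donor_feats):
    """BackgroundMix (sketch): keep the foreground moment intact and
    replace background clips with clips drawn from another video."""
    out = feats.copy()
    for t in range(len(out)):
        if not (start <= t < end):              # background position
            out[t] = donor_feats[t % len(donor_feats)]
    return out

# Toy usage: a 10-clip video whose moment spans clips [2, 8).
video = np.arange(10, dtype=np.float32)[:, None]    # [T, D] with D=1
donor = -np.ones((6, 1), dtype=np.float32)          # clips from another video
augmented = background_mix(foreground_mix(video, 2, 8), 2, 8, donor)
```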
- Installation & Data Setup
- MomentMix: Data Augmentation
- Training with Length-Aware Decoder
- Inference
- Pre-trained Checkpoints
## Installation & Data Setup

Before training or evaluation, please make sure both the datasets and the runtime environment are ready.

QVHighlights and the other benchmark datasets can be obtained by following the guidelines from QD-DETR.
```bash
git clone https://github.com/sjpark5800/LA-DETR
cd LA-DETR

conda create -n ladetr python=3.10
conda activate ladetr
pip install -r requirements.txt
```

## MomentMix: Data Augmentation

The dataset required for MomentMix is available in the `data/` directory.
You can either:
- Use the pre-generated dataset (recommended for reproducing our results), or
- Generate your own by following the detailed instructions in `momentmix/README.md`.
For experimental consistency, we strongly recommend using the provided pre-generated dataset.
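If you want to verify what the annotation files contain, a quick inspection script can help. A minimal sketch, assuming QVHighlights-style JSONL annotations with a `relevant_windows` field; the file path and length thresholds below are placeholders, not the paper's exact category boundaries:

```python
import json
from collections import Counter

def length_bucket(length_sec):
    # Hypothetical short / middle / long boundaries; see the paper
    # for the actual category definitions.
    if length_sec < 10:
        return "short"
    return "middle" if length_sec < 30 else "long"

counts = Counter()
with open("data/highlight_train_release.jsonl") as f:   # placeholder path
    for line in f:
        ann = json.loads(line)
        for st, ed in ann.get("relevant_windows", []):
            counts[length_bucket(ed - st)] += 1

print(counts)   # e.g. Counter({'short': ..., 'middle': ..., 'long': ...})
```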
## Training with Length-Aware Decoder

The following scripts train our models with the proposed MomentMix and Length-Aware Decoder.
```bash
# For LA-QD-DETR
bash la_qd_detr/scripts/train.sh

# For LA-TR-DETR
bash la_tr_detr/scripts/train.sh

# For LA-UVCOM
bash la_uvcom/scripts/qv/train.sh        # QVHighlights
bash la_uvcom/scripts/tacos/train.sh     # TACoS
bash la_uvcom/scripts/cha/train.sh       # Charades (SlowFast + CLIP)
bash la_uvcom/scripts/cha_vgg/train.sh   # Charades (VGG + GloVe)
```
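Inside the decoder, the length-wise bipartite matching restricts the Hungarian assignment to prediction/ground-truth pairs in the same length category. A minimal sketch, assuming 1-D spans and an L1 span cost; the bucket thresholds and cost function are illustrative, not the repository's exact configuration:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def bucket(span):
    length = span[1] - span[0]
    if length < 10:                 # hypothetical category boundaries
        return "short"
    return "middle" if length < 30 else "long"

def lengthwise_match(pred_spans, gt_spans):
    """Match predictions to ground truths only within the same
    length bucket, running Hungarian matching per bucket."""
    matches = []
    for cat in ("short", "middle", "long"):
        p_idx = [i for i, p in enumerate(pred_spans) if bucket(p) == cat]
        g_idx = [j for j, g in enumerate(gt_spans) if bucket(g) == cat]
        if not p_idx or not g_idx:
            continue
        # L1 distance between span endpoints as the matching cost.
        cost = np.array([[abs(pred_spans[i][0] - gt_spans[j][0]) +
                          abs(pred_spans[i][1] - gt_spans[j][1])
                          for j in g_idx] for i in p_idx])
        rows, cols = linear_sum_assignment(cost)
        matches += [(p_idx[r], g_idx[c]) for r, c in zip(rows, cols)]
    return matches

# Toy example: short predictions can only match the short ground truth.
print(lengthwise_match([(0, 5), (12, 40), (3, 8)], [(2, 7), (10, 45)]))
```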
## Inference

```bash
# For LA-QD-DETR
bash la_qd_detr/scripts/inference.sh {exp_dir}/model_best.ckpt 'val'
bash la_qd_detr/scripts/inference.sh {exp_dir}/model_best.ckpt 'test'

# For LA-TR-DETR
bash la_tr_detr/scripts/inference.sh {exp_dir}/model_best.ckpt 'val'
bash la_tr_detr/scripts/inference.sh {exp_dir}/model_best.ckpt 'test'

# For LA-UVCOM
bash la_uvcom/scripts/inference.sh {exp_dir}/model_best.ckpt 'val'
bash la_uvcom/scripts/inference.sh {exp_dir}/model_best.ckpt 'test'
```
`{exp_dir}` refers to the directory containing the trained model checkpoint and logs.

Note: to obtain results on the test split, please follow the Moment-DETR evaluation instructions.
## Pre-trained Checkpoints

We release pre-trained checkpoints and training logs for all reported experiments to ensure reproducibility. All model configurations are fully documented in the corresponding `opt.json` file.
📁 Download all checkpoints & logs here
| Dataset | Method | Model file |
|---|---|---|
| QVHighlights | QD-DETR + Ours | 🔗 checkpoint & log |
| QVHighlights | TR-DETR + Ours | 🔗 checkpoint & log |
| QVHighlights | UVCOM + Ours | 🔗 checkpoint & log |
| TACoS | UVCOM + Ours | 🔗 checkpoint & log |
| Charades | UVCOM + Ours | 🔗 checkpoint & log |
| Charades (VGG) | UVCOM + Ours | 🔗 checkpoint & log |
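To sanity-check a downloaded checkpoint before running inference, something like the following works. A minimal sketch, assuming standard PyTorch checkpoint files; the `model` key follows the Moment-DETR family's saving convention but should be verified against the actual files:

```python
import torch

ckpt = torch.load("model_best.ckpt", map_location="cpu")   # placeholder path
if isinstance(ckpt, dict):
    print("top-level keys:", list(ckpt.keys()))

# If the checkpoint is a dict holding a "model" state_dict (as in
# Moment-DETR-style repos), peek at a few parameter shapes:
state = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
for name, tensor in list(state.items())[:5]:
    print(name, tuple(tensor.shape))
```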
## Citation

If you find this work useful, please cite:

```bibtex
@article{park2024length,
  title={Length-Aware DETR for Robust Moment Retrieval},
  author={Park, Seojeong and Choi, Jiho and Baek, Kyungjune and Shim, Hyunjung},
  journal={arXiv preprint arXiv:2412.20816},
  year={2024}
}
```
## License & Acknowledgements

All code in this repository is released under the MIT License. Parts of the annotation files and several implementation components are adapted from Moment-DETR, QD-DETR, TR-DETR, and UVCOM.