This is the PyTorch implementation of our CVPR 2025 paper:

**Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection**
Marc-Antoine Lavoie, Anas Mahmoud, Steven Waslander
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Paper]
DINO Teacher is a domain adaptive object detection method that leverages vision foundation models (VFMs) as a source of pseudo-labels and for cross-domain alignment. Our work builds on Adaptive Teacher.
Please refer to INSTALL.md for installation instructions for DINO Teacher.
- Train the DINO labeller (you can replace the test datasets).

  ```shell
  python train_net.py \
        --num-gpus 2 \
        --config configs/vit_labeller.yaml \
        OUTPUT_DIR output/dino_label/test_vitl \
        SOLVER.IMG_PER_BATCH_LABEL 8 \
        DATASETS.TEST '("cityscapes_val","cityscapes_foggy_val","BDD_day_val")' \
        SEMISUPNET.DINO_BBONE_MODEL dinov2_vitl14
  ```

- Generate the target domain pseudo-labels. Note that we evaluate on the train split (`DATASETS.TEST=("BDD_day_train",)`) to generate the train split pseudo-labels. We use the checkpoint resuming function, so you should select the desired model by specifying the `OUTPUT_DIR` config variable and setting the desired checkpoint in the `last_checkpoint` file. The `SEMISUPNET.DINO_BBONE_MODEL` parameter initializes the ViT model and must match the size of the checkpoint for parameter loading. We evaluate on a single GPU.
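  Assuming the Detectron2-style checkpointing convention, `last_checkpoint` is a plain-text file inside `OUTPUT_DIR` containing just the checkpoint filename. A minimal sketch of pointing the resume logic at a specific checkpoint (the filename `model_0049999.pth` is illustrative):

  ```python
  from pathlib import Path

  # Assumption: Detectron2-style --resume reads the checkpoint filename from a
  # plain-text "last_checkpoint" file in OUTPUT_DIR. The name below is
  # illustrative; use a checkpoint that actually exists in your OUTPUT_DIR.
  out_dir = Path("output/dino_label/test_vitl")
  out_dir.mkdir(parents=True, exist_ok=True)
  (out_dir / "last_checkpoint").write_text("model_0049999.pth")
  print((out_dir / "last_checkpoint").read_text())
  ```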
  ```shell
  python train_net.py \
        --num-gpus 1 \
        --resume \
        --gen-labels \
        --config configs/vit_labeller.yaml \
        OUTPUT_DIR output/dino_label/test_vitl \
        DATASETS.TEST '("BDD_day_train",)' \
        SEMISUPNET.DINO_BBONE_MODEL dinov2_vitl14
  ```

- Run DINO Teacher on the desired target domain. You may have to specify the correct path to the labeller annotations.
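  A quick way to sanity-check the labeller annotations before training is to load the pickle and count its entries. The exact schema of the file is defined by the repo, so the dummy record below (keys `image_id`, `boxes`, `classes`) is only an assumed illustration of the loading pattern:

  ```python
  import pickle

  # Illustrative only: the real file is produced by the --gen-labels run above,
  # and its schema is defined by the repo. We write a dummy record with assumed
  # keys purely to demonstrate the load-and-inspect pattern.
  path = "BDD_day_train_dino_anno_vitl.pkl"
  dummy = [{"image_id": 0, "boxes": [[10.0, 20.0, 50.0, 80.0]], "classes": [2]}]
  with open(path, "wb") as f:
      pickle.dump(dummy, f)

  with open(path, "rb") as f:
      annos = pickle.load(f)
  print(f"Loaded pseudo-labels for {len(annos)} image(s)")
  ```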
  ```shell
  python train_net.py \
        --num-gpus 2 \
        --resume \
        --config configs/vgg_city2bdd.yaml \
        SEMISUPNET.LABELER_TARGET_PSEUDOGT output/dino_label/test_vitl/predictions/BDD_day_train_dino_anno_vitl.pkl
  ```

The DINO labellers are all trained on the original Cityscapes only. All results are mAP@0.5.
| Backbone | Cityscapes | Foggy Cityscapes | BDD100k | Weights | Forward Pass Labels |
|---|---|---|---|---|---|
| ViT-L | 61.3 | 54.6 | 45.7 | link | FCS, BDD |
| ViT-G | 64.3 | 58.8 | 51.1 | link | FCS, BDD |
The student models are trained on the source Cityscapes with ground truth before using the DINO labellers' pseudo-labels on the target domain.
| Target Domain | Backbone | Labeller Size | Align. Teacher | mAP@0.5 | Weights |
|---|---|---|---|---|---|
| Foggy Cityscapes | VGG | ViT-G | ViT-B | 55.4 | link |
| BDD100k | VGG | ViT-G | ViT-B | 47.8 | [link](https://drive.google.com/file/d/1EG-ldsKT5VjEck3Ke0uAACwWEOoeWrJe/view?usp=drive_link) |
If you use DINO Teacher in your research, please consider citing:

```bibtex
@article{lavoie2025large,
  title={Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection},
  author={Lavoie, Marc-Antoine and Mahmoud, Anas and Waslander, Steven L},
  journal={arXiv preprint arXiv:2503.23220},
  year={2025}
}
```
DINO Teacher is released under the Apache 2.0 license.