Thanks to visit codestin.com
Credit goes to github.com

Skip to content

TRAILab/DINO_Teacher

Repository files navigation

Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection

This is the PyTorch implementation of our CVPR 2025 paper:
Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection
Marc-Antoine Lavoie, Anas Mahmoud, Steven Waslander
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Paper]

DINO Teacher is a domain adaptive object detection method that leverages VFMs as a source of pseudo-labels and for cross-domain alignment. Our work is based off Adaptive Teacher.

Installation

Please refer to INSTALL.md for the installation of DINO Teacher.

Training

  • Train the DINO labeller (you can replace the test datasets).
python train_net.py\
      --num-gpus 2\
      --config configs/vit_labeller.yaml\
      OUTPUT_DIR output/dino_label/test_vitl\
      SOLVER.IMG_PER_BATCH_LABEL 8\
      DATASETS.TEST '("cityscapes_val","cityscapes_foggy_val","BDD_day_val")'\
      SEMISUPNET.DINO_BBONE_MODEL dinov2_vitl14
  • Generate the target domain pseudo-labels. Note that we evaluate on the train split (DATASETS.TEST=("BDD_day_train",)) to generate the train split pseudo-labels. We use the checkpoint resuming function, and so you should set the desired model by specifying the OUTPUT_DIR config variable and setting the desired checkpoint in the last_checkpoint file. The SEMISUPNET.DINO_BBONE_MODEL parameter initializes the ViT model and must match the size of the checkpoint for parameter loading. We evalate on a single GPU.
python train_net.py\
      --num-gpus 1\
      --resume\
      --gen-labels\
      --config configs/vit_labeller.yaml\
      OUTPUT_DIR output/dino_label/test_vitl\
      DATASETS.TEST '("BDD_day_train",)'\
      SEMISUPNET.DINO_BBONE_MODEL dinov2_vitl14
  • Run DINO Teacher on the desired target domain. You may have to specify the correct path to the labeller annotations.
python train_net.py\
      --num-gpus 2\
      --resume\
      --config configs/vgg_city2bdd.yaml\
      SEMISUPNET.LABELER_TARGET_PSEUDOGT output/dino_label/test_vitl/predictions/BDD_day_train_dino_anno_vitl.pkl

Results and Weights

DINO ViT Labellers

The DINO labellers are all trained on the original Cityscapes only. All results are [email protected].

Backbone Cityscapes Foggy Cityscapes BDD100k Weights Forward Pass Labels
ViT-L 61.3 54.6 45.7 link FCS, BDD
ViT-G 64.3 58.8 51.1 link FCS, BDD

Student Models

The student models are trained on the source Cityscapes with ground truth before using the DINO labellers pseudo-labels on the target domain.

Target Domain Backbone Labeller Size Align. Teacher [email protected] Weights
Foggy Cityscapes VGG ViT-G ViT-B 55.4 link
BDD100k VGG ViT-G ViT-B 47.8 [link]https://drive.google.com/file/d/1EG-ldsKT5VjEck3Ke0uAACwWEOoeWrJe/view?usp=drive_link)

Citation

If you use DINO Teacher in your research, please consider citing:

@article{lavoie2025large,
  title={Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection},
  author={Lavoie, Marc-Antoine and Mahmoud, Anas and Waslander, Steven L},
  journal={arXiv preprint arXiv:2503.23220},
  year={2025}
}

License

DINO Teacher is released under the Apache 2.0 license.

About

Source code for "Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection"

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages