Implementation of the paper "Foundation Visual Encoders Are Secretly Few-Shot Anomaly Detectors" (arXiv).
Guangyao Zhai, Yue Zhou, Xinyan Deng, Lars Heckler, Nassir Navab, and Benjamin Busam
Technical University of Munich • MVTec Software GmbH
All Python dependencies are listed in requirements.txt. We recommend Python ≥ 3.10.
```bash
conda create -n foundad python=3.10
conda activate foundad
git clone [email protected]:ymxlzgy/FoundAD.git
cd FoundAD
pip install -r requirements.txt
pip install -e .
```

Before we start, please make sure you have the rights to use DINOv3. Download our trained manifold projectors and put them under `./logs/`.
| DINOv3-based | 1-shot | 2-shot | 4-shot |
|---|---|---|---|
| MVTec AD | ⬇️ link | ⬇️ link | ⬇️ link |
| VisA | ⬇️ link | ⬇️ link | ⬇️ link |
Run a demo on MVTec AD:

```bash
python foundad/main.py mode=demo app=test testing.segmentation_vis=True data.dataset=mvtec data.data_name=mvtec_1shot data.test_root=assets/mvtec
```

Or a demo on VisA:

```bash
python foundad/main.py mode=demo app=test testing.segmentation_vis=True data.dataset=visa data.data_name=visa_4shot data.test_root=assets/visa
```

| Dataset | Preferred download |
|---|---|
| MVTec AD | Official site: Here |
| VisA | We use the structured dataset of RealNet. |
Create a few-shot subset with `sample.py`:

```bash
python foundad/src/sample.py source=/media/ymxlzgy/Data21/xinyan/visa target=/media/ymxlzgy/Data21/xinyan/visa_tmp seed=42 num_samples=2
```

where `source` is the dataset folder, `target` is the output folder for the few-shot samples, and `num_samples` is the number of samples used to train models, e.g., 2 for 2-shot learning. `seed` can be adjusted to run multiple rounds of experiments.
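The sampling step can be sketched as follows. This is a minimal illustration, not the actual `sample.py`; the `category/train/good` directory layout is an assumption based on the standard MVTec AD structure, so adapt the glob if your dataset differs.

```python
import random
import shutil
from pathlib import Path

def sample_few_shot(source: str, target: str, num_samples: int, seed: int = 42) -> None:
    """Copy a seeded random subset of training images per category.

    Assumes the MVTec-style layout <source>/<category>/train/good/*.png
    (an assumption; not necessarily the repo's exact logic).
    """
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    for category in sorted(Path(source).iterdir()):
        good_dir = category / "train" / "good"
        if not good_dir.is_dir():
            continue
        images = sorted(good_dir.glob("*.png"))
        picked = rng.sample(images, min(num_samples, len(images)))
        out_dir = Path(target) / category.name / "train" / "good"
        out_dir.mkdir(parents=True, exist_ok=True)
        for img in picked:
            shutil.copy2(img, out_dir / img.name)
```

Fixing the seed makes the subset reproducible, so repeated runs with different seeds give independent few-shot splits for multi-round experiments.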
```bash
python foundad/main.py mode=train data.batch_size=8 data.dataset=mvtec data.data_name=mvtec_1shot data.data_path=/media/ymxlzgy/Data21/xinyan app=train_dinov3 diy_name=dbug
```

where `data.dataset` is "mvtec" or "visa", `data.data_name` is the folder name of the few-shot samples, `data.data_path` is the path containing the few-shot folder, `app` is "train_dinov3" or another model config under `configs/app/`, and `diy_name` (optional) is a suffix appended to the model saving directory. To adjust the layer, specify `app.meta.n_layer`.
After training, run inference:
```bash
python foundad/main.py mode=AD data.dataset=mvtec data.data_name=mvtec_1shot diy_name=dbug data.test_root=/media/ymxlzgy/Data21/xinyan/mvtec app=test app.ckpt_step=1950
```

where `data.test_root` is the dataset folder, and `app` is `test_dinov2` or `test_dinov3` under `configs/app/`. To adjust the sample number K, specify `testing.K_top_mvtec` and `testing.K_top_visa`.
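A common way such a K parameter is used in anomaly detection is to aggregate patch-level scores into an image-level score by averaging the K most anomalous patches. The sketch below illustrates that idea; it is our reading of the role of `testing.K_top_*`, not necessarily the repo's exact scoring code.

```python
import numpy as np

def image_score(patch_scores: np.ndarray, k_top: int) -> float:
    """Average of the K largest patch anomaly scores (higher = more anomalous).

    k_top mirrors the assumed role of testing.K_top_mvtec / testing.K_top_visa.
    """
    flat = np.asarray(patch_scores, dtype=float).ravel()
    k = min(k_top, flat.size)
    # np.partition places the k largest values in the last k slots in O(n)
    top_k = np.partition(flat, -k)[-k:]
    return float(top_k.mean())
```

A small K makes the score sensitive to localized defects; a large K approaches the mean over all patches and favors globally distributed anomalies.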
This repo utilizes DINOv3, DINOv2, DINO, SigLIP, CLIP and DINOSigLIP. We also thank I-JEPA for the inspiration.