CVPR 2025
Reza Qorbani3 *
Gianluca Villani1,2 *
Theodoros Panagiotakopoulos1,5
Marc Botet Colomer1
Linus Härenstam-Nielsen6,7
Mattia Segu1,9
Pier Luigi Dovesi1,4 †
Jussi Karlgren1,4
Daniel Cremers6,7
Federico Tombari6,8
Matteo Poggi10
1The Good AI Lab
2University of Toronto
3KTH
4AMD Silo AI
5King
6Technical University of Munich
7Munich Center for Machine Learning
8Google
9ETH Zurich
10University of Bologna
*Joint first authorship †Project Lead
This repository contains the official implementation of our paper "Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation" published in CVPR 2025.
Semantic Library Adaptation (SemLA) is a training-free, test-time domain adaptation framework that dynamically retrieves and merges the most relevant LoRA adapters from a library based on semantic similarity to the target domain. SemLA constructs tailored models for each input without additional training, offering scalability, explainability, and privacy preservation.
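For orientation, the sketch below strings together the main commands covered in the rest of this README (all paths and file names are illustrative; every step is explained in detail in the sections that follow):

```bash
# End-to-end sketch (illustrative paths; each step is detailed in the sections below)
pip install -r requirements.txt                      # install dependencies
export DETECTRON2_DATASETS=/path/to/datasets         # point to the folder holding all datasets
# ...download the CAT-Seg checkpoint and the LoRA library as described below, then:
python domain_orchestrator/generate_embeddings.py \
    --source_domains config/source_domains.yaml \
    --lora_library_path catseg/loradb/               # build one embedding per source domain
python experiments.py --experiment semla \
    --semla_config config/semla_config.yaml \
    --source_domains config/source_domains.yaml \
    --target_domains config/target_domains.yaml \
    --output_dir ./results/semla                     # per-image retrieval, fusion, evaluation
```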
The repository is structured as follows:
- catseg/: Contains the implementation of the CAT-Seg model that is used as the open-vocabulary semantic segmentation backbone. The model checkpoint, adapters and generated embeddings should be stored in this directory as well.
- domain_orchestrator/: Contains the implementation of the main experiments reported in our paper. These experiments are Zero-shot, Oracle, Uniform Merge, and SemLA from Table 1 in the paper.
  - experiments.py: Script to run the experiments.
- First, follow the environment setup instructions in the installation guide.
- Once your environment is set up, install the required packages using the requirements.txt file located in the root directory of this repository:
pip install -r requirements.txt

Please follow the instructions in catseg/datasets/README.md for setting up the core datasets. For the additional datasets used in our experiments, follow the dataset-specific instructions below and on their official websites.

IMPORTANT: Please store all datasets in the directory specified by the environment variable $DETECTRON2_DATASETS. Please check the domain_args function in domain_orchestrator/utils.py for the expected directory name and structure for each dataset.
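For example, the variable can be set like this (the path is just an example):

```bash
# Example only: the datasets root can live anywhere, as long as the variable points to it.
export DETECTRON2_DATASETS=/data/datasets
```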
We followed the same setup as in TokenFusion for the NYU Depth V2 dataset. The dataset should be downloaded from here and extracted to the directory $DETECTRON2_DATASETS/nyudv2/. Then copy the train.txt and test.txt from /misc/nyudv2_splits/ to the same directory ($DETECTRON2_DATASETS/nyudv2/). The directory structure should look like this:
$DETECTRON2_DATASETS/nyudv2/
├── depth/
├── mask/
├── rgb/
├── train.txt
└── test.txt

To set up the PASCAL Context 59 dataset, please follow the instructions in the FC-CLIP repository instead of the instructions in the CAT-Seg README. Once you have downloaded the dataset and the annotations and installed the Detail API, please run the preparation script prepare_pascal_ctx_sem_seg.py in the catseg/datasets/ folder.
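As a rough sketch of the two preparation steps above, copying the NYU split files and running the PASCAL Context script could look as follows (this assumes the split files sit under the repository's misc/ folder and that the preparation script reads $DETECTRON2_DATASETS and needs no arguments; check the script before running):

```bash
# Copy the NYU Depth V2 split files shipped with this repository into the dataset folder.
cp misc/nyudv2_splits/train.txt misc/nyudv2_splits/test.txt "$DETECTRON2_DATASETS/nyudv2/"

# Prepare PASCAL Context 59 (assumes the script reads $DETECTRON2_DATASETS and takes no arguments).
python catseg/datasets/prepare_pascal_ctx_sem_seg.py
```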
In our benchmark, we use the COCONut-Large dataset. Please check out this issue for clarification on the expected size of the dataset. After gathering all the necessary data, segmentation maps can be extracted from the panoptic labels using this script. Instructions regarding the extraction procedure can be found in this README.
We used CAT-Seg (L) as our primary backbone. Please follow the instructions here to download the checkpoint. Once downloaded, place the checkpoint in the directory catseg/models/ and rename the checkpoint to model_final.pth. In other words, the relative path to the checkpoint should be:
catseg/
├── models/
│ └── model_final.pth

We provide trained LoRAs which can be downloaded from here. Please extract the folder containing the LoRAs to catseg/loradb/. This folder contains one sub-folder for each adapter such that, once properly set up, the relative path to each LoRA should be:
catseg/
├── loradb/
│ ├── a150/
│ ├── acdc-fog/
│ ├── acdc-night/
│ ├── acdc-rain/
│ ├── ...

IMPORTANT: Please do not alter the names of these folders, otherwise the experiments will not work properly.
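Assuming the checkpoint and the LoRA archive have already been downloaded, putting both in place could look roughly like this (the downloaded file names are hypothetical):

```bash
# File names below are hypothetical; adjust them to whatever you downloaded.
mkdir -p catseg/models
mv ~/Downloads/cat_seg_large.pth catseg/models/model_final.pth   # CAT-Seg (L) checkpoint
unzip ~/Downloads/semla_loras.zip -d catseg/loradb/              # LoRA library archive
ls catseg/loradb/   # each adapter must sit directly under catseg/loradb/, e.g. catseg/loradb/acdc-rain/
```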
IMPORTANT: Before training or evaluating new LoRAs, please change the current directory to catseg/ by running cd catseg/.
Please first read the instructions in the CAT-Seg README to understand how to train and evaluate the CAT-Seg model. Fine-tuning the CAT-Seg model using LoRAs is very similar to the standard training process.
We provide a convenient script, run_lora.sh, for training LoRA adapters. This script simplifies the process of domain adaptation through parameter-efficient fine-tuning.
sh run_lora.sh [CONFIG] [NUM_GPUS] [OUTPUT_DIR] [OPTS] MODEL.LORA.NAME [ADAPTER_NAME]

For example, to train a LoRA on the acdc-rain domain with default parameters, run the following command:
sh run_lora.sh configs/acdc/rain/lora-rain-acdc.yaml 1 output/acdc-rain/ MODEL.LORA.NAME acdc-rain

This command trains a LoRA adapter for the ACDC rain domain. It uses the settings in configs/acdc/rain/lora-rain-acdc.yaml. While the full model gets saved to output/acdc-rain/, you only need the small LoRA adapter that's automatically stored in catseg/loradb/acdc-rain/. The full model file can be deleted after training since only the adapter is needed for later use.
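Once training has finished, only the adapter directory is needed for retrieval and fusion; the full model written to the output directory can optionally be removed, for example:

```bash
# Optional cleanup after training (paths relative to the catseg/ directory you launched from):
ls loradb/acdc-rain/        # the small LoRA adapter kept for retrieval and fusion
rm -rf output/acdc-rain/    # the full fine-tuned model; safe to remove if you no longer need it
```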
The LoRA-related hyperparameters, including which modules the LoRAs are attached to, the rank, and the default LoRA storage path, can all be found in the catseg/cat_seg/config.py file.
To change the default hyperparameters, you can either change the values in the config files (e.g., configs/acdc/rain/lora-rain-acdc.yaml for model parameters or catseg/cat_seg/config.py for LoRA-related hyperparameters) or pass the parameters and their new values as arguments to the training script directly. For example, to change the rank of the LoRA and the number of iterations, you can run the following command:
sh run_lora.sh configs/acdc/rain/lora-rain-acdc.yaml 1 output/acdc-rain/ MODEL.LORA.NAME acdc-rain MODEL.LORA.RANK 10 SOLVER.MAX_ITER 1000

The evaluation script has a similar format to the training script. We provide a convenient script, eval_lora.sh, for evaluating LoRA adapters.
sh eval_lora.sh [CONFIG] [NUM_GPUS] [OUTPUT_DIR] [OPTS] MODEL.LORA.NAME [ADAPTER_NAME]

To evaluate a LoRA on the acdc-rain domain, run the following command:
sh eval_lora.sh configs/acdc/rain/lora-rain-acdc.yaml 1 output/acdc-rain/ MODEL.LORA.NAME acdc-rain

This command will attach the saved LoRA to the model and evaluate it on the acdc-rain domain.
The framework supports four different domain adaptation methods:
- Zero-shot: Evaluates a model trained on source domains directly on target domains without adaptation. Serves as a baseline.
- Oracle: Uses a single LoRA adapter specifically trained on the target domain. When testing on domain x, uses only the adapter trained on x.
- Uniform Merge: Combines source domain adapters with equal weights. Simple strategy requiring no target domain data for weight calculation.
- SemLA: Dynamically weights source domain adapters based on similarity to the target domain. Weights are calculated during inference for each image.
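For intuition only, one plausible formulation of this weighting, consistent with the configuration parameters introduced later (distance measure $d$, temperature $\tau$, and top-$k$ selection), is a temperature-scaled softmax over negative embedding distances restricted to the $k$ most similar source domains; the exact definition is given in the paper:

$$
w_i = \frac{\exp\big(-d(\mathbf{z}, \mathbf{z}_i)/\tau\big)}{\sum_{j \in \mathcal{K}} \exp\big(-d(\mathbf{z}, \mathbf{z}_j)/\tau\big)}, \qquad
\Delta\theta = \sum_{i \in \mathcal{K}} w_i \, \Delta\theta_i
$$

Here $\mathbf{z}$ is the embedding of the current image, $\mathbf{z}_i$ the stored embedding of source domain $i$, $d$ the chosen distance (euclidean or cosine), $\tau$ the temperature, $\mathcal{K}$ the set of top-$k$ retrieved domains, and $\Delta\theta_i$ the parameters of the $i$-th LoRA adapter; the merged update $\Delta\theta$ is applied to the frozen backbone. Uniform Merge corresponds to setting all weights equal across the source adapters.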
Experiments are executed using the experiments.py script.
Prior to running experiments, the following YAML configuration files must be prepared:
List of source domains (source_domains.yaml):
- domain1
- domain2
- domain3
- ...

List of target domains used for evaluation (target_domains.yaml):
- domain1
- domain2
- ...

In order to run the experiments, the domain names in the source_domains.yaml and target_domains.yaml files must adhere to the following naming convention (an example pair of files is shown after the table):
| Domain Name | Description |
|---|---|
| muses-rain-day | MUSES dataset images of rainy conditions during daylight |
| muses-rain-night | MUSES dataset images of rainy conditions at night |
| muses-clear-day | MUSES dataset images of clear weather during daylight |
| muses-clear-night | MUSES dataset images of clear weather at night |
| muses-snow-day | MUSES dataset images of snowy conditions during daylight |
| muses-snow-night | MUSES dataset images of snowy conditions at night |
| muses-fog-day | MUSES dataset images of foggy conditions during daylight |
| muses-fog-night | MUSES dataset images of foggy conditions at night |
| acdc-rain | ACDC dataset images of rainy weather conditions |
| acdc-fog | ACDC dataset images of foggy weather conditions |
| acdc-night | ACDC dataset images captured at night |
| acdc-snow | ACDC dataset images of snowy weather conditions |
| cs-normal | Cityscapes dataset |
| bdd | BDD100K (10k subset) |
| mv | Mapillary Vistas |
| a150 | ADE20K dataset with 150 semantic classes |
| idd | India Driving Dataset |
| pc59 | PASCAL Context dataset with 59 semantic segmentation categories |
| nyu | NYU-Depth V2 dataset |
| coconutL | COCONut-Large dataset |
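For example, a minimal pair of domain lists using names from the table above could be created as follows (the particular choice of domains is purely illustrative):

```bash
# Illustrative domain lists using names from the table above.
mkdir -p config
cat > config/source_domains.yaml <<'EOF'
- acdc-fog
- acdc-night
- acdc-snow
- cs-normal
EOF
cat > config/target_domains.yaml <<'EOF'
- acdc-rain
- bdd
EOF
```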
Parameters for experiments (semla_config.yaml):
distance_measure_name: euclidean # or cosine
temperature: 0.05
top_k: 5
combination_type: cat

Before running the experiments, embeddings for each domain must be generated. This can be done by running the following command:
python domain_orchestrator/generate_embeddings.py --source_domains source_domains.yaml --lora_library_path catseg/loradb/

This script generates the average embedding for each domain specified in source_domains.yaml and saves it in the corresponding adapter folder inside the catseg/loradb/ directory. The embeddings are used to calculate the similarity between the source and target domains during adaptation. For example, after generating the embeddings for the domain acdc-rain, its directory structure should look like this:
catseg/
├── loradb/
│ ├── acdc-rain/
│ │ ├── acdc-rain_statistics.npz
│ │ ├── adapter_config.json
│ │ └── adapter_model.safetensors
│ ├── ...

NOTE: This step can take a while to complete, depending on the number of domains and the size of the dataset.
The following command line arguments are available for the experiments.py script:
- --experiment: Type of experiment (zeroshot, oracle, uniform, or semla)
- --source_domains: Path to YAML file containing source domains
- --target_domains: Path to YAML file containing target domains
- --semla_config: Path to YAML file containing configuration parameters (required for the semla experiment)
- --output_dir: Directory to save results
- --remove_target_adapter: Flag to exclude the target domain's adapter from the source adapters during merging. This ensures a fair evaluation by preventing the model from using knowledge specific to the target domain.
The script saves the following output files in the specified output directory:
- results.json: Contains evaluation metrics (e.g., accuracy scores) for each target domain and experimental setting.
- weights.json: Contains the adapter weights used during inference for each image when using the Online Merge (SemLA) method.
In all commands below, the argument --source_domains specifies which source domains (i.e., which adapters) to load and the argument --target_domains specifies which domains to evaluate on. For the oracle, the target domain is also used as a source domain (i.e., no adaptation is performed), and for zeroshot, no source domain is used for adaptation, but the source domains should match the specified target domains.
python experiments.py --experiment zeroshot \
--source_domains config/source_domains.yaml \
--target_domains config/target_domains.yaml \
--output_dir ./results/zeroshot

python experiments.py --experiment oracle \
--source_domains config/source_domains.yaml \
--target_domains config/target_domains.yaml \
--output_dir ./results/oracle

python experiments.py --experiment uniform \
--source_domains config/source_domains.yaml \
--target_domains config/target_domains.yaml \
--remove_target_adapter \
--output_dir ./results/uniform

python experiments.py --experiment semla \
--semla_config config/semla_config.yaml \
--source_domains config/source_domains.yaml \
--target_domains config/target_domains.yaml \
--remove_target_adapter \
--output_dir ./results/semla

See LICENSE for details about the license terms for this repository. Portions of the project are under separate license terms. CAT-Seg is licensed under the MIT License, and the license can be found here. In addition, we use some files from Detectron2 and FC-CLIP, which are under the Apache-2.0 License, and from Mask2Former, which is under the MIT License.
We would like to thank the authors of CAT-Seg, whose code has been utilized in this project.
If you find this repository useful in your research, please consider citing our paper:
@inproceedings{qorbani2025semla,
author = {Qorbani, Reza and Villani, Gianluca and Panagiotakopoulos, Theodoros and Botet Colomer, Marc and H{\"a}renstam-Nielsen, Linus and Segu, Mattia and Dovesi, Pier Luigi and Karlgren, Jussi and Cremers, Daniel and Tombari, Federico and Poggi, Matteo},
title = {Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2025}
}