Amin Parchami-Araghi · Sukrut Rao · Jonas Fischer · Bernt Schiele
Paper | Video (soon)
git clone https://github.com/m-parchami/FaCT.git
conda create --name fact python=3.10
conda activate fact
cd FaCT; pip install -r requirements.txt
You can then either run conda activate fact before every script, or replace the CONDA variable in the scripts and point it to the installed conda binary.
The training implementation is provided under fact.training. You can use train.sh for sample training commands for different layers and architectures. The script includes commands for both local runs and Slurm jobs.
bash train.sh
For every new layer or architecture, you need to run the train script once to generate the features, and then run it again with --precomputed-feats pointing to the path of the feature tensor. Currently the code is configured to keep the stored dataset at roughly 170 GB (it adjusts itself based on the number of feature dimensions). If 170 GB is too much for your setup, simply reduce the number of feature samples per image here or specify a lower --num_batches when collecting training features. I would recommend reducing the number of samples per image, so that the whole dataset is still covered; a rough size estimate is sketched below. I will soon add a less memory-intensive implementation where the features are stored in smaller tensor chunks that can be loaded independently.
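As a hedged back-of-the-envelope (not the script's internal logic), you can estimate the on-disk size from the number of images, the samples kept per image, and the feature dimension. All numbers below are assumptions; adjust them to your setup:

```python
# Rough size estimate for the stored feature dataset. All numbers are assumptions
# (ImageNet-1k train set, float32 features), not values taken from the training code.
def dataset_size_gb(num_images=1_281_167,   # ImageNet-1k training images
                    samples_per_image=8,    # spatial feature samples kept per image
                    feat_dim=2048,          # e.g. ResNet-50 layer4 channels
                    bytes_per_value=4):     # float32
    return num_images * samples_per_image * feat_dim * bytes_per_value / 1024**3

print(f"~{dataset_size_gb():.0f} GB")       # ~78 GB here; halving samples_per_image halves this
```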
We have released our FaCT models on GitHub. These are per-model and per-layer SAE checkpoints (bias-free TopK-SAE) that need to be plugged into their corresponding B-cos-v2 checkpoints. All of our evaluation scripts take care of this: the correct B-cos-v2 checkpoint is downloaded based on the --arch argument you pass.
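For intuition, here is a minimal sketch of what this plugging-in could look like. The checkpoint keys, the layer name, and the file path are assumptions made for illustration only; the evaluation scripts implement the actual loading.

```python
# Minimal sketch (not the repo's loading code): attach an assumed TopK-SAE checkpoint
# to a B-cos-v2 backbone and read sparse concept codes at one layer.
import torch

# B-cos-v2 backbones are available via torch.hub; the evaluation scripts download
# the right one automatically based on --arch.
backbone = torch.hub.load("B-cos/B-cos-v2", "resnet50", pretrained=True).eval()

ckpt = torch.load("checkpoints/fact_resnet50_layer4.pt", map_location="cpu")  # hypothetical path
W_enc, W_dec, k = ckpt["W_enc"], ckpt["W_dec"], ckpt["k"]                     # assumed keys

feats = {}
layer = dict(backbone.named_modules())["layer4"]        # assumed layer name
layer.register_forward_hook(lambda m, i, o: feats.update(out=o))

with torch.no_grad():
    backbone(torch.rand(1, 6, 224, 224))                # B-cos-v2 expects 6-channel [x, 1-x] input
    acts = feats["out"].permute(0, 2, 3, 1).reshape(-1, W_enc.shape[0])
    codes = acts @ W_enc                                 # bias-free encoder
    topk = torch.topk(codes, k, dim=-1)                  # keep only the top-k concepts per token
    sparse_codes = torch.zeros_like(codes).scatter_(-1, topk.indices, topk.values)
```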
All of the sample commands for this section (i.e., model performance evaluation, concept consistency evaluation, and visualizations) are provided under analysis.sh. For all the evaluation scripts, passing --download v0.1 uses our provided checkpoints; otherwise, the script assumes you have trained your own and tries to load it locally.
The fact.analysis.analysis script evaluates model accuracy and also collects per-concept activation and contribution statistics over the test set. These statistics are required for the other evaluations below.
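To give a rough idea of what gets stored (the script defines its own format; the shapes, names, and bookkeeping below are purely illustrative), the statistics are per-concept summaries of this kind:

```python
# Illustrative sketch of per-concept statistics over a test set (not the script's
# actual format): a running mean of activations plus the indices of the
# top-activating images, which the visualization code later needs.
import torch

num_concepts, keep = 4096, 16                      # assumed SAE width / images kept per concept
mean_act = torch.zeros(num_concepts)
top_scores = torch.full((num_concepts, keep), -float("inf"))
top_images = torch.full((num_concepts, keep), -1, dtype=torch.long)
n_seen = 0

def update(codes, image_idx):
    """codes: (tokens, num_concepts) concept codes of one test image."""
    global n_seen
    per_image = codes.amax(dim=0)                  # strongest activation anywhere in the image
    n_seen += 1
    mean_act.mul_((n_seen - 1) / n_seen).add_(per_image / n_seen)
    # merge this image into the per-concept top-k lists
    scores = torch.cat([top_scores, per_image[:, None]], dim=1)
    images = torch.cat([top_images, torch.full((num_concepts, 1), image_idx)], dim=1)
    order = scores.argsort(dim=1, descending=True)[:, :keep]
    top_scores.copy_(scores.gather(1, order))
    top_images.copy_(images.gather(1, order))
```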
Using fact.analysis.dino_consistency, you can measure how consistent the top-X% activations of each concept are. Our metric uses DINOv2 features and upsamples them with LoftUp. You can either install LoftUp as a package or keep a clone of their repo in the root directory. In principle, you can replace both the foundation model and the upsampler with other alternatives, as long as they perform well on semantic correspondence tasks.
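As a simplified stand-in for that metric (LoftUp upsampling omitted, patch-level DINOv2 features only; function and variable names are illustrative, not the repo's API), a consistency check could look like this:

```python
# Simplified consistency sketch: do the top-activating regions of a concept carry
# similar DINOv2 features across images? This is NOT the paper's exact metric
# (no LoftUp upsampling here); it only illustrates the idea.
import torch
import torch.nn.functional as F

dino = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

@torch.no_grad()
def consistency(images, concept_maps, top_frac=0.05):
    """
    images:       (N, 3, 224, 224) ImageNet-normalized crops that activate one concept
    concept_maps: (N, 16, 16) spatial activations of that concept at patch resolution
    Returns the mean pairwise cosine similarity of the concept's top-activating
    DINOv2 patch descriptors across images (higher = more consistent concept).
    """
    feats = dino.forward_features(images)["x_norm_patchtokens"]   # (N, 256, 384) for ViT-S/14
    descs = []
    for f, m in zip(feats, concept_maps):
        k = max(1, int(top_frac * m.numel()))
        idx = m.flatten().topk(k).indices                          # top-activating patches
        descs.append(F.normalize(f[idx].mean(0), dim=-1))          # one descriptor per image
    d = torch.stack(descs)
    sim = d @ d.T
    return sim[~torch.eye(len(d), dtype=torch.bool)].mean()
```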
You can use fact.plotting.plot_inference either to visualize concepts or to visualize the decision-making by measuring concept contributions. Examples for both are provided in the main function of fact.plotting.plot_inference. This code also relies on the statistics collected above to know which images activate each concept most strongly.
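For intuition on the contribution-based view (names and shapes below are hypothetical, not the repo's API): if the network downstream of the SAE layer is summarized by an input-specific linear map for the target class, as B-cos networks allow, the class logit decomposes into per-concept contributions along these lines:

```python
# Illustrative decomposition of a class logit into per-concept contributions.
# All names are hypothetical; fact.plotting.plot_inference implements the real thing.
import torch

def concept_contributions(codes, W_dec, w_eff):
    """
    codes: (num_concepts,) sparse concept activations for one image (pooled over space)
    W_dec: (num_concepts, d) SAE decoder directions
    w_eff: (d,) effective linear weight of the downstream network for the target class
    Returns per-concept contributions whose sum is the reconstructed class logit.
    """
    return codes * (W_dec @ w_eff)

# e.g. the five concepts that pushed the prediction the most:
# top_vals, top_idx = concept_contributions(codes, W_dec, w_eff).topk(5)
```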
I'd be more than happy to help you with any questions regarding the implementation or the paper :) Simply create an issue here or write me an email.
If you use our models, metric, or implementation, kindly cite us as follows:
@inproceedings{
parchami2025fact,
title={Fa{CT}: Faithful Concept Traces for Explaining Neural Network Decisions},
author={Amin Parchami-Araghi and Sukrut Rao and Jonas Fischer and Bernt Schiele},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
}