CRCE is a novel concept erasure framework for text-to-image diffusion models that handles coreferential concepts (synonyms, related terms) to prevent bypass attacks while preserving model utility.
CRCE uses a multi-objective loss function:
L_total = L_anchor + α·L_coref + β·L_retain
- L_anchor: Erases target concept
- L_coref: Erases related concepts (synonyms, variations)
- L_retain: Preserves unrelated concepts
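As a minimal sketch of how the three terms combine, assuming each term has already been computed as a scalar (or a tensor that supports ordinary arithmetic): the function name `total_loss` and the default weights are illustrative assumptions, not values taken from the CRCE code base.

```python
def total_loss(l_anchor, l_coref, l_retain, alpha=1.0, beta=1.0):
    """Combine the loss terms as L_anchor + alpha * L_coref + beta * L_retain.

    alpha weights the coreference-erasure term and beta weights the
    retention term. The defaults here are placeholders for illustration only.
    """
    return l_anchor + alpha * l_coref + beta * l_retain


# Example: a larger beta puts more weight on preserving the retained concepts.
print(total_loss(0.8, 0.4, 0.2, alpha=0.5, beta=2.0))
```

Raising α pushes training to erase synonyms and variations more aggressively, while raising β favors keeping unrelated concepts intact.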
git clone https://github.com/vios-s/CRCE-Coreference-Retention-Concept-Erasure-in-Text-to-Image-Diffusion-Models
cd CRCE-Coreference-Retention-Concept-Erasure-in-Text-to-Image-Diffusion-Models
pip install -r requirements.txt
from srcs.ours_tools import execute_ours_unlearn
# Erase "airplane" and related concepts while preserving other flying objects
result = execute_ours_unlearn(
    erase_concept="airplane",
    coref_concept="aeroplane,plane,jet plane,passenger plane",
    retain_concept="hot air balloon,blimp,rocket,drone",
    iterations=500,
    train_method='xattn-strict'
)
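The call above passes coreferential and retained concepts as comma-separated strings. If the concepts are kept in Python lists, a small helper can build those strings. This is a sketch only: `join_concepts` is a hypothetical helper, not part of the CRCE repository.

```python
def join_concepts(concepts):
    """Join concept names into one comma-separated string.

    Hypothetical helper (not part of CRCE): strips surrounding whitespace
    and drops empty entries so the result matches the "a,b,c" form used above.
    """
    cleaned = [c.strip() for c in concepts if c.strip()]
    return ",".join(cleaned)


coref = join_concepts(["aeroplane", "plane", "jet plane", "passenger plane"])
retain = join_concepts(["hot air balloon", "blimp", "rocket", "drone"])
print(coref)
print(retain)
```

The resulting strings can then be passed as `coref_concept` and `retain_concept`.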
For automated experiments with LLM-guided concept identification:
python srcs/main.py # Requires llmconfig.json with API keys
Pre-curated concept sets for reproducible experiments:
- CorefConcept/object.json: CIFAR-10 based objects
- CorefConcept/celebrity.json: Public figures
- CorefConcept/ip.json: Intellectual property
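The internal layout of these JSON files is not described here, so the sketch below only inspects a file's top-level structure rather than assuming a schema. The helper name `describe_concept_file` is hypothetical.

```python
import json
from pathlib import Path


def describe_concept_file(path):
    """Load a JSON file and summarize its top-level structure.

    Hypothetical helper: makes no assumption about the CRCE file schema.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # For a dict, report the first few keys.
        return {"type": "dict", "size": len(data), "sample": list(data)[:5]}
    if isinstance(data, list):
        # For a list, report the first few items.
        return {"type": "list", "size": len(data), "sample": data[:5]}
    return {"type": type(data).__name__, "size": 1, "sample": [data]}


if __name__ == "__main__":
    path = Path("CorefConcept/object.json")
    if path.exists():  # only runs from inside the cloned repository
        print(describe_concept_file(path))
```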
CRCE achieves 95%+ target concept removal while maintaining 85%+ retention quality, with 2-5x faster training compared to full model fine-tuning.
@inproceedings{xue2025crce,
title={CRCE: Coreference Retention Concept Erasure in Text-to-Image Diffusion Models},
author={Xue, Yuyang and Moroshko, Edward and Chen, Feng and Sun, Jingyu and McDonagh, Steven and Tsaftaris, Sotirios A},
booktitle={British Machine Vision Conference (BMVC)},
year={2025}
}
MIT License - see LICENSE file for details.