This repository provides the official implementation. It supports deletion and addition attacks with human-readable rationales, and provides ready-to-run scripts for reproducible experiments. For more details, please refer to our paper.
The formatting is currently messy, and it will be refined after the paper is accepted.
We recommend Python 3.10 or later. To install dependencies, run:
    pip install -r requirements.txt

The datasets are already included in kg/. We also provide LoRA weights for two datasets for reproduction.
You can find ready-to-run shell scripts in scripts/.
hoa_sft.sh: supervised fine-tuning (SFT) for knowledge alignment via triple classification.
llm_del_filter.sh: filters the candidate entities for the deletion attack.
llm_del.sh: performs the deletion attack.
llm_add_filter.sh: filters the candidate entities for the addition attack.
llm_add.sh: performs the addition attack.
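As a rough illustration of how the attack scripts might be chained, the sketch below runs the filter step and then the attack step for one attack type. It assumes that the scripts take no arguments, that they are launched from the repository root, and that filtering must finish before the attack starts; none of this is confirmed by the scripts themselves, and `run_attack` is a hypothetical helper, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): runs the filter script
# and then the attack script for one attack type, stopping at the first failure.
# Assumption: the scripts in scripts/ need no command-line arguments.

run_attack() {
  local mode="$1"            # "del" (deletion) or "add" (addition)
  local dir="${2:-scripts}"  # directory holding the shell scripts
  local step

  case "$mode" in
    del|add) ;;
    *) echo "usage: run_attack del|add [script_dir]" >&2; return 2 ;;
  esac

  # Assumed order: candidate filtering first, then the attack itself.
  for step in "llm_${mode}_filter.sh" "llm_${mode}.sh"; do
    if [ ! -f "$dir/$step" ]; then
      echo "missing script: $dir/$step" >&2
      return 1
    fi
    echo "running $step"
    bash "$dir/$step" || return 1
  done
}
```

After sourcing this file, `run_attack del` would run the deletion pipeline and `run_attack add` the addition pipeline. If you fine-tune from scratch rather than using the provided LoRA weights, hoa_sft.sh would presumably need to be run beforehand.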
We thank the authors of these open-source projects for their contributions: AttributionAttack, KoPA, KG-LLM