# [ICLR 2025 Spotlight] Uncovering Overfitting in Large Language Model Editing
-
A GPU with at least 48 GB of memory is required.
-
For the environment, run:

```shell
conda create -n evoke python=3.9.7
conda activate evoke
pip install -r requirements.txt
```

An example of editing GPT-J with ROME-LTI on the EVOKE dataset:
```shell
python -m experiments.evaluate_evoke_main \
    --alg_name=ROME-LTI \
    --model_name=[path/to/your/gpt-j/model] \
    --hparams_fname=gpt-j-6b.json \
    --ds_name=evoke-main \
    --num_edits=1
```

-
Computing the covariance matrix estimation
Use `experiments.evaluate_evoke_subj_spec` to get the results on the Subject Specificity task. To summarize the results, use `experiments/summarize.py`:

```shell
python -m experiments.summarize --dir_name=ROME-LTI --runs=run_<run1>
```

The code for our experiments is based on MEMIT.
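For quick programmatic post-processing, per-case metrics can also be averaged directly. A hypothetical sketch follows: the metric names and dict layout are assumptions, and `experiments/summarize.py` remains the authoritative tool for the actual result schema.

```python
from statistics import mean

def aggregate_runs(case_results):
    """Average each metric across per-case result dicts.
    (Metric names here are made up for illustration; adapt them
    to the JSON actually written by the evaluation scripts.)"""
    metrics = {}
    for case in case_results:
        for name, value in case.items():
            metrics.setdefault(name, []).append(value)
    return {name: mean(values) for name, values in metrics.items()}

# toy usage on two hypothetical edited cases
cases = [
    {"rewrite_success": 1.0, "paraphrase_success": 0.0},
    {"rewrite_success": 1.0, "paraphrase_success": 1.0},
]
summary = aggregate_runs(cases)
print(summary)  # {'rewrite_success': 1.0, 'paraphrase_success': 0.5}
```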
For ROME and MEMIT, we use precomputed Wikipedia stats for GPT-2 XL and GPT-J from kmeng01/rome, and stats for Llama-2-7b from mjy1111/PEAK. Thanks for their contributions!
If you find this work helpful for your research, please cite:
```bibtex
@inproceedings{
  zhang2025uncovering,
  title={Uncovering Overfitting in Large Language Model Editing},
  author={Mengqi Zhang and Xiaotian Ye and Qiang Liu and Shu Wu and Pengjie Ren and Zhumin Chen},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=t8qcGXaepr}
}
```