This is the official repository for Improved Techniques for Optimization-Based Jailbreaking on Large Language Models.
Please feel free to contact [email protected] if you have any questions.
python attack_llm_core_best_update_our_target.py --behaviors_config=behaviors_ours_config.json
python generate_our_config.py
python run_multiple_attack_our_target.py --behaviors_config=behaviours_gcss_config_init_v2_continued.json --output_path=gcss --model_path="/home/LLM/Llama-2-7b-chat-hf"
If this work helps your research, please cite the paper as:
@article{jia2024improved,
  title={Improved Techniques for Optimization-Based Jailbreaking on Large Language Models},
  author={Xiaojun Jia and Tianyu Pang and Chao Du and Yihao Huang and Jindong Gu and Yang Liu and Xiaochun Cao and Min Lin},
  year={2024},
  eprint={2405.21018},
  archivePrefix={arXiv}
}