A curated collection of open-source tools, benchmarks, datasets, and papers focused on the security of Large Language Models (LLMs) — covering prompt injection, jailbreaks, red-teaming, evaluations, and defenses.
🛡️ Stay secure. Test everything. Contribute often.
- 🛡️ Red Teaming & Attack Frameworks
- 🧠 Jailbreak & Prompt Injection
- 🔍 Detection & Defense Tools
- 📈 Benchmarks & Datasets
- 📜 Papers & Research
- 🧩 Related Projects & Standards
- 🤝 Contributing
- 📄 License
- DeepTeam – Modular LLM red teaming framework (prompt injection, hallucination, data leaks, jailbreaks).
- garak – LLM vulnerability scanner covering toxicity, bias, sensitive data leaks, etc.
- LLMFuzzer – Adversarial fuzzing toolkit for generating harmful prompts automatically.
- Awesome-Jailbreak-on-LLMs – Collection of jailbreak techniques, datasets, and defenses.
- JailbreakingLLMs (PAIR) – Black-box jailbreak generation via automatic prompt refinement.
- jailbreak_llms – Real-world prompt jailbreak dataset (15k+ examples).
- llm-warden – Hugging Face-based jailbreak detection model.
- vigil-llm – REST API for LLM security risk scoring.
- last_layer – Low-latency pre-filter for prompt injection prevention.
- JailbreakBench – Benchmark suite for evaluating jailbreak resilience and defense effectiveness.
- LLM Red Teaming Dataset – Google’s multi-domain adversarial testing prompts and tasks.
- AdvBench – Benchmark focused on adversarial robustness and safety of LLMs.
- UDora – Unified red teaming for LLM agents via reasoning hijacks (arXiv)
- PrivAgent – Privacy attack simulation through agentic LLMs (arXiv)
- AutoDefense – Multi-agent automated jailbreak defense (arXiv)
- GuardrailsAI – Add structured validation and policy enforcement for LLMs.
- Rebuff – Defense wrapper for LLMs, stops adversarial instructions.
- OWASP Top 10 for LLMs (2024) – Official list of key LLM risks including prompt injection.
Want to add a tool, dataset, or paper? 💡
- Fork this repo
- Add your entry under the right category in
README.md - Submit a pull request!
See CONTRIBUTING.md for detailed guidelines.
This project is licensed under the MIT License.
If you find this list helpful:
- Give us a ⭐️ on GitHub
- Share it with fellow researchers and builders
- Submit issues/PRs to keep it fresh!
Maintained by the community. Inspired by awesome-mlops, awesome-appsec, and awesome-ai-security.