Thanks to visit codestin.com
Credit goes to github.com

Skip to content

A powerful, community-curated toolkit to attack, evaluate, defend, and monitor Large Language Models (LLMs) — covering everything from prompt injection to jailbreak detection.

Notifications You must be signed in to change notification settings

d1pakda5/awesome-llm-security-tool

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 

Repository files navigation

🧠 Awesome LLM Security Tools Awesome

A curated collection of open-source tools, benchmarks, datasets, and papers focused on the security of Large Language Models (LLMs) — covering prompt injection, jailbreaks, red-teaming, evaluations, and defenses.

🛡️ Stay secure. Test everything. Contribute often.



📚 Table of Contents


🛡️ Red Teaming & Attack Frameworks

  • DeepTeam – Modular LLM red teaming framework (prompt injection, hallucination, data leaks, jailbreaks).
  • garak – LLM vulnerability scanner covering toxicity, bias, sensitive data leaks, etc.
  • LLMFuzzer – Adversarial fuzzing toolkit for generating harmful prompts automatically.

🧠 Jailbreak & Prompt Injection


🔍 Detection & Defense Tools

  • llm-warden – Hugging Face-based jailbreak detection model.
  • vigil-llm – REST API for LLM security risk scoring.
  • last_layer – Low-latency pre-filter for prompt injection prevention.

📈 Benchmarks & Datasets

  • JailbreakBench – Benchmark suite for evaluating jailbreak resilience and defense effectiveness.
  • LLM Red Teaming Dataset – Google’s multi-domain adversarial testing prompts and tasks.
  • AdvBench – Benchmark focused on adversarial robustness and safety of LLMs.

📜 Papers & Research

  • UDora – Unified red teaming for LLM agents via reasoning hijacks (arXiv)
  • PrivAgent – Privacy attack simulation through agentic LLMs (arXiv)
  • AutoDefense – Multi-agent automated jailbreak defense (arXiv)

🧩 Related Projects & Standards

  • GuardrailsAI – Add structured validation and policy enforcement for LLMs.
  • Rebuff – Defense wrapper for LLMs, stops adversarial instructions.
  • OWASP Top 10 for LLMs (2024) – Official list of key LLM risks including prompt injection.

🤝 Contributing

Want to add a tool, dataset, or paper? 💡

  • Fork this repo
  • Add your entry under the right category in README.md
  • Submit a pull request!

See CONTRIBUTING.md for detailed guidelines.


📄 License

This project is licensed under the MIT License.


🌟 Show Your Support

If you find this list helpful:

  • Give us a ⭐️ on GitHub
  • Share it with fellow researchers and builders
  • Submit issues/PRs to keep it fresh!

Maintained by the community. Inspired by awesome-mlops, awesome-appsec, and awesome-ai-security.

About

A powerful, community-curated toolkit to attack, evaluate, defend, and monitor Large Language Models (LLMs) — covering everything from prompt injection to jailbreak detection.

Topics

Resources

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published