This repo sets up a testing environment for Garak, a vulnerability scanner for LLMs, with support for running tests against hosted models or models that expose a REST API. It uses OpenWebUI as a front end to Ollama, a tool that lets you run large language models (LLMs) locally on your own computer.
├── docker-compose.yml # Spins up the full test environment (Garak + model)
├── garak/ # All things Garak, including Dockerfile and config
│ ├── Dockerfile.garak # Dockerfile for building the Garak image
│ ├── local_images/ # Pre-built Docker images for different OS platforms
│ ├── ollama_generator/ # Config for using Garak with Ollama
│ │ └── ollama_options.json # JSON config file pointing Garak to the Ollama host
│ └── rest_generator/ # Config for using Garak with REST-based LLM APIs
│ ├── rest_request.json # REST generator config (used with llama for demo)
│ └── README # Instructions for creating and using the REST generator
├── README.md # You're here! Overview of the project
└── results/ # Output directory for test reports
Install Docker & Docker Compose
- Follow the Docker docs to Get Docker
Adjust Permissions
The results are provided through a bind mount for easy viewing on the host computer. However, the Garak container runs as UID 1001 and needs write access to the results directory. The command below gives every user read, write, and execute access to the directory. Only do this for testing.
chmod 777 ./results
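If you'd rather not open the directory to everyone, a narrower option is to hand ownership to the container user directly (a sketch assuming the UID 1001 mentioned above; adjust if your image uses a different user):
# give the bind-mounted directory to the container's UID
sudo chown 1001:1001 ./results
# owner and group can write, everyone else can only read/list
chmod 775 ./results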
Bring up the containers
This can be done using a pre-built image (good for those behind a corporate proxy), or you can let docker compose build it for you.
Let Docker Compose build it (preferred method)
docker compose up --build -d
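Once the stack is up, a quick status and log check never hurts (the service/container names below follow this repo's compose file; substitute your own if they differ):
# confirm every service started
docker compose ps
# follow the logs of a single service, e.g. the Ollama backend
docker compose logs -f ollama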
To use a pre-built image, follow the steps below.
# build the image manually
docker build --no-cache -t garak_testing-garak .
# save it to a tar archive
docker save -o amd64-garak.tar garak_testing-garak
# move it to garak/local_images/ on the host computer
mv /src/path/amd64-garak.tar garak/local_images/amd64-garak.tar
# load the image into docker
docker load --input garak/local_images/amd64-garak.tar
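# sanity check (optional): the image should now appear in the local registry
docker image ls | grep garak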
# start the containers
docker compose up -d
Set Up OpenWebUI
- Navigate to http://localhost:3000
- Create a testing account
- Log into the testing account for the next steps
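If the page won't load, you can check whether the front end is answering from a terminal (port 3000 matches the URL above; adjust if your compose file publishes a different one):
# expect a 200 (or a redirect code) once OpenWebUI is ready
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000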
Download Model(s) for Testing
The compose file has a helper container that preloads the llama3 model, but if you want to use another one, follow the steps below.
- Click on the test account on the top right
- Select `Admin panel`
- Select the `Settings` tab at the top
- Choose the `Models` tab on the left
- Click the download icon
- In the URI field to pull a model from Ollama.com, enter `llama3.2`
- If you want another model instead, Ollama has many to choose from (just check the license first)
- Click download
- Start a new chat with the selected model
Alternatively, you can log into the Ollama container directly and pull a model without using the OpenWebUI front end.
# get to the ollama container shell
docker exec -it ollama /bin/bash
# download the model
ollama pull llama3.2
# verify it downloaded
ollama list
# run the model
ollama run llama3.2
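You can also confirm the model made it in without a container shell by hitting Ollama's HTTP API from the host (this assumes the compose file publishes Ollama's default port 11434):
# list the models the Ollama server knows about
curl -s http://localhost:11434/api/tags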
Corporate
- If you're behind a corporate proxy and/or using WSL, you may need to adjust some of the commands and config files.
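For example, if the Dockerfiles honor the standard proxy variables, you can pass them through at build time (a sketch; it assumes the proxy variables are already exported in your shell):
# forward the host's proxy settings into the image build
docker compose build \
  --build-arg HTTP_PROXY="$HTTP_PROXY" \
  --build-arg HTTPS_PROXY="$HTTPS_PROXY" \
  --build-arg NO_PROXY="$NO_PROXY"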
| Component | What it does | Why you care |
|---|---|---|
| Probes | Provide the attacks/payloads | Define what is being tested |
| Generators | Connect to the model | Define how you talk to it |
| Detectors | Evaluate model responses | Define what counts as failure |
| Evaluators | Score performance | Optional, more nuanced metrics |
| Reports | Save outputs | Useful for audits & dashboards |
Log into the garak container
docker exec -it garak /bin/bash
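Once inside, you can browse the rest of Garak's plugin inventory beyond the probes and generators covered in the cheat sheet at the end (flag names are from current Garak releases; confirm with garak --help on your version):
# enumerate the available detectors and buffs
garak --list_detectors
garak --list_buffs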
Run Some Garak Tests
Pick a generator, choose your probe, and fire it at the model!
ℹ️ Note: LLMs are non-deterministic, so they can produce different outputs given the same input. That's where the `--generations` flag comes in. It tells Garak how many times to run each test. The default is 10 (great for thoroughness but kind of a time hog); somewhere around 3–4 is usually enough to catch interesting stuff for a demo without waiting forever.
Here’s an example command to get you rolling. Happy probing 😈:
garak --model_type ollama \
--model_name llama3.2:latest \
--probes xss \
--generations 3 \
--generator_option_file /app/ollama_options.json \
--report_prefix /app/results/$(date +%Y-%m-%dT%H%M%S) \
--verbose
Viewing the Results
When all probe tests have finished, two reports will be generated; a third appears only if there are findings. They are moved to the results/ directory of the repository on the host computer (via the bind mount set up earlier).
📜 report.html: Very high level, bare bones summary
📜 hitlog.jsonl: Contains only the failures
📜 report.jsonl: Contains all the tests run and the result of each.
The JSON Lines files can be pretty-printed for reading (e.g. with the jq command below) or consumed by other processes; add jq's -c flag if you want one compact record per line for grepping.
jq '.' /path/to/workspace/garak_demo/results/2025-04-12T082159.hitlog.jsonl
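If you're not sure which fields a hit record carries, you can ask jq for the key names before writing more specific queries (same illustrative path as above):
# print the field names present in each hit record
jq -c 'keys' /path/to/workspace/garak_demo/results/2025-04-12T082159.hitlog.jsonl | sort -u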
Shut It Down
Once you're done testing, you'll want to stop all the running services.
docker compose down --remove-orphans
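If you also want to reclaim disk space, including any named volume holding the downloaded models (assuming the compose file declares one), take the teardown a step further:
# stop the stack and delete its named volumes as well
docker compose down --volumes --remove-orphans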
Head over to the REST README to learn how to build custom REST requests that Garak can use to run the tests.
You'll need to be logged into the garak container for these to work
## Log into the garak container
docker exec -it garak /bin/bash
## List Loaded Probes.
## The stars 🌟 indicate a whole plugin
garak --list_probes
## List Supported Generators.
garak --list_generators
## Run All Probes (Danger Zone™)
## 🛑 Heads-up: this takes a while and can generate a lot of output.
garak --model_type ollama \
--model_name llama3.2:latest \
--probes all \
--generations 3 \
--generator_option_file /app/ollama_options.json \
--report_prefix /app/results/$(date +%Y-%m-%dT%H%M%S) \
--verbose
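Running every probe is rarely necessary; more often you'll target a handful of probe modules. The module names below ship with current Garak releases, but confirm against garak --list_probes on your install:
# run a couple of targeted probe modules instead of the full suite
garak --model_type ollama \
--model_name llama3.2:latest \
--probes dan,promptinject \
--generations 3 \
--generator_option_file /app/ollama_options.json \
--report_prefix /app/results/$(date +%Y-%m-%dT%H%M%S) \
--verbose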