This repo sets up a testing environment for Garak, a vulnerability scanner for LLMs, with support for running tests against hosted models or models that expose a REST API. It uses OpenWebUI as a front end to Ollama, a tool that lets you run large language models (LLMs) locally on your own computer.
├── docker-compose.yml # Spins up the full test environment (Garak + model)
├── garak/ # All things Garak, including Dockerfile and config
│ ├── Dockerfile.garak # Dockerfile for building the Garak image
│ ├── local_images/ # Pre-built Docker images for different OS platforms
│ ├── ollama_generator/ # Config for using Garak with Ollama
│ │ └── ollama_options.json # JSON config file pointing Garak to the Ollama host
│ └── rest_generator/ # Config for using Garak with REST-based LLM APIs
│ ├── rest_request.json # REST generator config (used with llama for demo)
│ └── README # Instructions for creating and using the REST generator
├── README.md # You're here! Overview of the project
└── results/ # Output directory for test reports
Install Docker & Docker Compose
- Follow the Docker docs to Get Docker
Adjust Permissions
The results are provided through a bind mount for easy viewing on the host computer. However, the Garak container runs as UID 1001 and needs write access to the results directory. The command below gives every user read, write, and execute access to the directory. Only do this for testing.
chmod 777 ./results
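If you'd rather not open the directory to everyone, a narrower option is to hand ownership to the container user directly (a sketch assuming the UID 1001 mentioned above; adjust if your image uses a different user):
# give the bind-mounted directory to the container's UID
sudo chown 1001:1001 ./results
# owner and group can write, everyone else can only read/list
chmod 775 ./results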
Bring up the containers
This can be done using a pre-built image (good for those behind a corporate proxy), or you can let docker compose build it for you.
Let Docker Compose build it (preferred method)
docker compose up --build -d
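Once the stack is up, a quick status and log check never hurts (the service/container names below follow this repo's compose file; substitute your own if they differ):
# confirm every service started
docker compose ps
# follow the logs of a single service, e.g. the Ollama backend
docker compose logs -f ollama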
To use a pre-built image, follow the steps below.
# build the image manually
docker build --no-cache -t garak_testing-garak .
# save it to a tar archive
docker save -o amd64-garak.tar garak_testing-garak
# move it to garak/local_images/ on the host computer
mv /src/path/amd64-garak.tar garak/local_images/amd64-garak.tar
# load the image into docker
docker load --input garak/local_images/amd64-garak.tar
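# sanity check (optional): the image should now appear in the local registry
docker image ls | grep garak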
# start the containers
docker compose up -d
Set Up OpenWebUI
- Navigate to http://localhost:3000
- Create a testing account
- Log into the testing account for the next steps
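If the page won't load, you can check whether the front end is answering from a terminal (port 3000 matches the URL above; adjust if your compose file publishes a different one):
# expect a 200 (or a redirect code) once OpenWebUI is ready
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000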
Download Model(s) for Testing
The compose file has a helper container that preloads the llama3 model, but if you want to use another one, follow the steps below.
- Click on the test account on the top right
- Select `Admin panel`
- Select the `Settings` tab at the top
- Choose the `Models` tab on the left
- Click the download icon
- In the URI field to pull a model from Ollama.com, enter `llama3.2`
- If you want another model instead, Ollama has many to choose from (just check the license first)
- Click download
- Start a new chat with the selected model
Alternatively, you can log into the Ollama container directly and pull a model without using the OpenWebUI front end.
# get to the ollama container shell
docker exec -it ollama /bin/bash
# download the model
ollama pull llama3.2
# verify it downloaded
ollama list
# run the model
ollama run llama3.2
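You can also confirm the model made it in without a container shell by hitting Ollama's HTTP API from the host (this assumes the compose file publishes Ollama's default port 11434):
# list the models the Ollama server knows about
curl -s http://localhost:11434/api/tags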
Corporate
- If you're behind a corporate proxy and/or using WSL, you may need to adjust some of the commands and config files.
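For example, if the Dockerfiles honor the standard proxy variables, you can pass them through at build time (a sketch; it assumes the proxy variables are already exported in your shell):
# forward the host's proxy settings into the image build
docker compose build \
  --build-arg HTTP_PROXY="$HTTP_PROXY" \
  --build-arg HTTPS_PROXY="$HTTPS_PROXY" \
  --build-arg NO_PROXY="$NO_PROXY"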
| Component | What it does | Why you care |
|---|---|---|
| Probes | Provide the attacks/payloads | Define what is being tested |
| Generators | Connect to the model | Define how you talk to it |
| Detectors | Evaluate model responses | Define what counts as failure |
| Evaluators | Score performance | Optional, more nuanced metrics |
| Reports | Save outputs | Useful for audits & dashboards |
Log into the garak container
docker exec -it garak /bin/bash
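Once inside, you can browse the rest of Garak's plugin inventory beyond the probes and generators covered in the cheat sheet at the end (flag names are from current Garak releases; confirm with garak --help on your version):
# enumerate the available detectors and buffs
garak --list_detectors
garak --list_buffs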
Run Some Garak Tests
Pick a generator, choose your probe, and fire it at the model!
ℹ️ Note: LLMs are non-deterministic, so they can produce different outputs given the same input. That's where the `--generations` flag comes in. It tells Garak how many times to run each test. The default is 10 (great for thoroughness but kind of a time hog); somewhere around 3–4 is usually enough to catch interesting stuff for a demo without waiting forever.
Here’s an example command to get you rolling. Happy probing 😈:
garak --model_type ollama \
--model_name llama3.2:latest \
--probes xss \
--generations 3 \
--generator_option_file /app/ollama_options.json \
--report_prefix /app/results/$(date +%Y-%m-%dT%H%M%S) \
--verbose
Viewing the Results
When all probe tests have finished, two reports will be generated; a third appears only if there are findings. They are moved to the results/ directory of the repository on the host computer (via the bind mount set up earlier).
📜 report.html: Very high level, bare bones summary
📜 hitlog.jsonl: Contains only the failures
📜 report.jsonl: Contains all the tests run and the result of each.
The JSON Lines files can be pretty-printed for reading (e.g. with the jq command below) or consumed by other processes; add jq's -c flag if you want one compact record per line for grepping.
jq '.' /path/to/workspace/garak_demo/results/2025-04-12T082159.hitlog.jsonl
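If you're not sure which fields a hit record carries, you can ask jq for the key names before writing more specific queries (same illustrative path as above):
# print the field names present in each hit record
jq -c 'keys' /path/to/workspace/garak_demo/results/2025-04-12T082159.hitlog.jsonl | sort -u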
Shut It Down
Once you're done testing, you'll want to stop all the running services.
docker compose down --remove-orphans
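If you also want to reclaim disk space, including any named volume holding the downloaded models (assuming the compose file declares one), take the teardown a step further:
# stop the stack and delete its named volumes as well
docker compose down --volumes --remove-orphans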
Head over to the REST README to learn how to build custom REST requests that Garak can use to run the tests.
You'll need to be logged into the garak container for these to work
## Log into the garak container
docker exec -it garak /bin/bash
## List Loaded Probes.
## The stars 🌟 indicate a whole plugin
garak --list_probes
## List Supported Generators.
garak --list_generators
## Run All Probes (Danger Zone™)
## 🛑 Heads-up: this takes a while and can generate a lot of output.
garak --model_type ollama \
--model_name llama3.2:latest \
--probes all \
--generations 3 \
--generator_option_file /app/ollama_options.json \
--report_prefix /app/results/$(date +%Y-%m-%dT%H%M%S) \
--verbose
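Running every probe is rarely necessary; more often you'll target a handful of probe modules. The module names below ship with current Garak releases, but confirm against garak --list_probes on your install:
# run a couple of targeted probe modules instead of the full suite
garak --model_type ollama \
--model_name llama3.2:latest \
--probes dan,promptinject \
--generations 3 \
--generator_option_file /app/ollama_options.json \
--report_prefix /app/results/$(date +%Y-%m-%dT%H%M%S) \
--verbose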