hackaprompt, Dreadnode AIRT Agent Example for the Learn Prompting AI CTF challenges - inline with the Dreadnode SDK
An AI Red Teaming framework for testing prompt injection attacks against hackaprompt challenges.
Install dependencies:
uv sync
Get your session token (CRITICAL - Must include ALL cookies):
⚠️ IMPORTANT: You need the COMPLETE Cookie header, not just the auth tokens!
Step-by-step instructions:
- Log in to hackaprompt.com
- Make sure you can see the challenges (fully logged in)
- Open DevTools (F12) → Network tab
- Refresh the page or navigate to any challenge
- Click on any request to hackaprompt.com in the Network tab
- In the Request Headers section, find the Cookie header
- Copy the ENTIRE Cookie header value (very long string)
Required cookies (your Cookie header must include ALL of these):
- _ga_*=... ← Google Analytics (required!)
- _ga=... ← Google Analytics (required!)
- _fbp=... ← Facebook pixel (required!)
- sb-iligpfkvyargzgpcrquc-auth-token.0=... ← Auth token part 1
- sb-iligpfkvyargzgpcrquc-auth-token.1=... ← Auth token part 2
- ph_phc_*=... ← PostHog analytics
Expected format:
_ga_C1Y5QFNVBM=GS1.1.1752777779.3.1.1752777826.0.0.0; _ga=GA1.1.1961087042.1752612468; _fbp=fb.1.1752612468123.123456789; sb-iligpfkvyargzgpcrquc-auth-token.0=base64-...; sb-iligpfkvyargzgpcrquc-auth-token.1=...; ph_phc_qHP99rFsNJJhkOJRxkiJm4Trf4A3TormWq0g5ShTLfc_posthog=%7B...
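Before exporting the cookie string, it can help to check that nothing was lost while copying. The following is a minimal sketch, not part of the framework: the helper names `cookie_names` and `missing_cookies` are hypothetical, and the required names are taken from the list above, with a trailing `*` treated as a prefix match.

```python
# Required cookie names from the list above. A trailing "*" means
# "any cookie whose name starts with this prefix".
REQUIRED_COOKIES = [
    "_ga_*",
    "_ga",
    "_fbp",
    "sb-iligpfkvyargzgpcrquc-auth-token.0",
    "sb-iligpfkvyargzgpcrquc-auth-token.1",
    "ph_phc_*",
]


def cookie_names(cookie_header: str) -> list[str]:
    """Return the cookie names found in a raw Cookie header value."""
    names = []
    for part in cookie_header.split(";"):
        # Split on the first "=" only, since values may contain "=".
        name, sep, _value = part.strip().partition("=")
        if sep and name:
            names.append(name)
    return names


def missing_cookies(cookie_header: str) -> list[str]:
    """Return the required entries that the header does not contain."""
    names = cookie_names(cookie_header)
    missing = []
    for required in REQUIRED_COOKIES:
        if required.endswith("*"):
            found = any(n.startswith(required[:-1]) for n in names)
        else:
            found = required in names
        if not found:
            missing.append(required)
    return missing
```

An empty result means every expected cookie is present; any listed name points at a part of the header that was cut off or never copied.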
Set environment variables:
# Required - complete cookie string from browser
export HACKAPROMPT_SESSION_TOKEN="_ga_C1Y5QFNVBM=...; _ga=...; _fbp=...; sb-iligpfkvyargzgpcrquc-auth-token.0=base64-...; sb-iligpfkvyargzgpcrquc-auth-token.1=...; ph_phc_qHP99rFsNJJhkOJRxkiJm4Trf4A3TormWq0g5ShTLfc_posthog=..."
export ANTHROPIC_API_KEY="your_anthropic_key" # or other AI provider
# Optional
export GROQ_API_KEY="your_groq_key"
export OPENAI_API_KEY="your_openai_key"
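A quick pre-flight check of these variables can catch a missing export before a run starts. This is a sketch under one assumption: that any one of the three provider keys listed above is enough, which is how the "or other AI provider" comment reads. The function name `check_environment` is hypothetical.

```python
import os
from typing import Mapping, Optional

SESSION_VAR = "HACKAPROMPT_SESSION_TOKEN"
# Assumption: one provider key is sufficient; adjust to the model you use.
PROVIDER_VARS = ["ANTHROPIC_API_KEY", "GROQ_API_KEY", "OPENAI_API_KEY"]


def check_environment(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return a list of human-readable problems; empty means the setup looks usable."""
    env = os.environ if env is None else env
    problems = []
    if not env.get(SESSION_VAR, "").strip():
        problems.append(f"{SESSION_VAR} is not set")
    if not any(env.get(name, "").strip() for name in PROVIDER_VARS):
        problems.append("no provider key set, expected one of: " + ", ".join(PROVIDER_VARS))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```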
Get Session ID (Required for API reliability):
⚠️ IMPORTANT: For reliable API calls, you need to intercept a session ID from your browser:
- Open any hackaprompt challenge in your browser (make sure you're logged in)
- Open Developer Tools (F12) → Console tab
- Paste and run this code:
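// Wrap window.fetch so request bodies sent to /api/chat can be inspected.
const originalFetch = window.fetch;
window.fetch = function (...args) {
  const [resource, options] = args;
  // The first argument may be a string, a URL, or a Request object.
  const url = typeof resource === 'string' ? resource : (resource?.url ?? String(resource));
  if (url.includes('/api/chat') && typeof options?.body === 'string') {
    try {
      const body = JSON.parse(options.body);
      if (body.session_id) {
        console.log('🎯 SESSION ID:', body.session_id);
        window.hackaPromptSessionId = body.session_id;
      }
    } catch (e) {
      // Body was not JSON; ignore it.
    }
  }
  return originalFetch.apply(this, args);
};
console.log('✅ Ready! Submit a prompt to capture session ID.');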
- Submit any prompt in the challenge (like "hello")
- Copy the session ID from the console output
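If you prefer to copy the raw request payload from the Network tab instead of using the console snippet, the same field can be pulled out offline. This sketch assumes only what the snippet above relies on: that the `/api/chat` request body is JSON with a top-level `session_id` field. The function name is hypothetical.

```python
import json
from typing import Optional


def extract_session_id(request_body: str) -> Optional[str]:
    """Return the session_id from a JSON request body, or None if absent."""
    try:
        body = json.loads(request_body)
    except (json.JSONDecodeError, TypeError):
        return None  # not JSON, or not a string at all
    if not isinstance(body, dict):
        return None
    session_id = body.get("session_id")
    if isinstance(session_id, str) and session_id:
        return session_id
    return None
```

The returned value is what you would pass to `--session-id`.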
Run challenges:
# RECOMMENDED: Run with intercepted session ID (most reliable)
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges basics --session-id "your-session-id-here"
# Run all challenges with session ID
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --session-id "your-session-id-here"
# Run specific challenge with increased steps
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges instruction_defense --max-steps 500 --session-id "your-session-id-here"
# Run with proxy (e.g., Burp Suite)
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges basics --proxy http://127.0.0.1:8080 --session-id "your-session-id-here"
# Run without session ID (may encounter API errors)
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges basics
# List available challenges
uv run -m hackaprompt.main --list-challenges
The framework:
- Loads challenges from hackaprompt using your session token
- Generates prompt injection attacks using an AI agent deployed via Dreadnode's awesome library, rigging.
- Executes attacks against real hackaprompt challenges and returns output from the challenge into the rigging chat pipeline
- Validates responses using hackaprompt's judge API
- Reports success when the target phrase "I have been PWNED" is achieved
- Records metrics and runs scoring verifiers through the Dreadnode SDK for efficient evaluation
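The generate, execute, validate cycle in the list above can be sketched as a small loop. This is an illustration only, not the framework's code: the three callables are hypothetical stand-ins for the rigging agent, the hackaprompt API call, and the judge, and `phrase_judge` is a local substring check rather than hackaprompt's judge API. The `max_steps` parameter mirrors the `--max-steps` flag.

```python
from typing import Callable, Optional

TARGET_PHRASE = "I have been PWNED"


def phrase_judge(output: str) -> bool:
    """Local stand-in for the judge: does the output contain the target phrase?"""
    return TARGET_PHRASE in output


def run_attack_loop(
    generate: Callable[[str], str],   # hypothetical: feedback -> next attack prompt
    execute: Callable[[str], str],    # hypothetical: attack prompt -> challenge output
    judge: Callable[[str], bool],     # hypothetical: challenge output -> solved?
    max_steps: int = 10,
) -> Optional[str]:
    """Return the first prompt that the judge accepts, or None if none succeeds."""
    feedback = ""
    for _ in range(max_steps):
        prompt = generate(feedback)
        output = execute(prompt)
        if judge(output):
            return prompt
        # Feed the challenge output back so the next attempt can adapt.
        feedback = output
    return None
```

Passing the previous output back into `generate` reflects the description above, where challenge output is returned into the chat pipeline between attempts.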
Use --list-challenges to see all available challenges across all tracks.
The framework supports HTTP proxies for traffic interception and analysis:
# Use Caido/Burp Suite (default local listener address)
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --proxy http://127.0.0.1:8080
# Use custom proxy
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --proxy http://proxy.example.com:3128
# No proxy (default)
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct
Important: Since the framework runs inside Docker containers, localhost addresses (127.0.0.1, localhost) are automatically converted to host.docker.internal to access services running on the host machine. SSL certificate verification is disabled for proxy connections to support SSL interception tools like Caido or Burp Suite.
This allows security researchers to:
- Intercept and analyze hackaprompt API traffic
- Modify requests/responses for testing
- Monitor prompt injection attempts
- Debug authentication issues
Setup for Caido or Burp Suite:
- Start Caido/Burp Suite and configure it to listen on 127.0.0.1:8080
- Run the framework with --proxy http://127.0.0.1:8080
- The framework will automatically convert this to host.docker.internal:8080 for Docker networking
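The localhost-to-Docker rewrite described above could look roughly like the following. This is a sketch of the idea using only the standard library, not the framework's actual implementation; the function name `to_docker_proxy` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

LOCAL_HOSTS = {"127.0.0.1", "localhost"}
DOCKER_HOST = "host.docker.internal"


def to_docker_proxy(proxy_url: str) -> str:
    """Rewrite a loopback proxy URL so it is reachable from inside a container."""
    parts = urlsplit(proxy_url)
    if parts.hostname not in LOCAL_HOSTS:
        return proxy_url  # remote proxies are left untouched

    # Rebuild the network location, keeping credentials and port if present.
    netloc = DOCKER_HOST
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    if parts.username:
        userinfo = parts.username
        if parts.password:
            userinfo = f"{userinfo}:{parts.password}"
        netloc = f"{userinfo}@{netloc}"
    return urlunsplit(parts._replace(netloc=netloc))


print(to_docker_proxy("http://127.0.0.1:8080"))  # http://host.docker.internal:8080
```

Only the host is swapped; scheme, port, and path stay as given, so a custom proxy such as `http://proxy.example.com:3128` passes through unchanged.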
- Docker isolation for secure code execution
- Session management with proper cookie handling
- Real API calls to hackaprompt (no simulation)
- Concurrent execution across multiple challenges
- Comprehensive logging via Dreadnode integration
- Proxy support for traffic analysis and debugging
The framework uses two JSON files for challenge information:
- challenges_info.json - Basic challenge metadata from hackaprompt API (names, URLs, status)
- hackaprompt/challenge_manifest.json - Complete challenge data including:
  - Detailed descriptions with constraints, scoring, and mechanics
  - Complete defending model system prompts for each challenge
  - Challenge types and verbose explanations
  - All context needed for effective prompt injection attacks
The manifest-based approach ensures attack models receive comprehensive challenge context rather than just basic API metadata, enabling more targeted and effective prompt injection strategies.
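One way to picture the manifest-based approach is as an overlay of manifest details on top of the basic API metadata. The sketch below is purely illustrative: the schemas of the two JSON files are not shown here, so every key name (`name`, `url`, `status`, `description`, `system_prompt`) and the function name are assumptions.

```python
def build_challenge_context(info_entry: dict, manifest_entry: dict) -> dict:
    """Combine basic API metadata with manifest details for one challenge."""
    context = dict(info_entry)      # start from the basic metadata
    context.update(manifest_entry)  # manifest values win when keys clash
    return context


# Hypothetical entries; the real files may be structured differently.
info = {"name": "basics", "url": "https://example.invalid/basics", "status": "open"}
manifest = {
    "name": "basics",
    "description": "Make the model say the target phrase.",
    "system_prompt": "You are a helpful assistant.",
}
context = build_challenge_context(info, manifest)
```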