
GangGreenTemperTatum/hackaprompt

hackaprompt AI Red Teaming Framework

hackaprompt: a Dreadnode AIRT agent example for the Learn Prompting AI CTF challenges, in line with the Dreadnode SDK


An AI Red Teaming framework for testing prompt injection attacks against hackaprompt challenges.

Quick Start

  1. Install dependencies:

    uv sync
  2. Get your session token (CRITICAL - Must include ALL cookies):

    ⚠️ IMPORTANT: You need the COMPLETE Cookie header, not just the auth tokens!

    Step-by-step instructions:

    1. Log in to hackaprompt.com
    2. Make sure you can see the challenges (fully logged in)
    3. Open DevTools (F12) → Network tab
    4. Refresh the page or navigate to any challenge
    5. Click on any request to hackaprompt.com in the Network tab
    6. In the Request Headers section, find the Cookie header
    7. Copy the ENTIRE Cookie header value (very long string)

    Required cookies (your Cookie header must include ALL of these):

    • _ga_*=... ← Google Analytics (required!)
    • _ga=... ← Google Analytics (required!)
    • _fbp=... ← Facebook pixel (required!)
    • sb-iligpfkvyargzgpcrquc-auth-token.0=... ← Auth token part 1
    • sb-iligpfkvyargzgpcrquc-auth-token.1=... ← Auth token part 2
    • ph_phc_*=... ← PostHog analytics

    Expected format:

    _ga_C1Y5QFNVBM=GS1.1.1752777779.3.1.1752777826.0.0.0; _ga=GA1.1.1961087042.1752612468; _fbp=fb.1.1752612468123.123456789; sb-iligpfkvyargzgpcrquc-auth-token.0=base64-...; sb-iligpfkvyargzgpcrquc-auth-token.1=...; ph_phc_qHP99rFsNJJhkOJRxkiJm4Trf4A3TormWq0g5ShTLfc_posthog=%7B...
    
  3. Set environment variables:

    # Required - complete cookie string from browser
    export HACKAPROMPT_SESSION_TOKEN="_ga_C1Y5QFNVBM=...; _ga=...; _fbp=...; sb-iligpfkvyargzgpcrquc-auth-token.0=base64-...; sb-iligpfkvyargzgpcrquc-auth-token.1=...; ph_phc_qHP99rFsNJJhkOJRxkiJm4Trf4A3TormWq0g5ShTLfc_posthog=..."
    export ANTHROPIC_API_KEY="your_anthropic_key"  # or other AI provider
    
    # Optional
    export GROQ_API_KEY="your_groq_key"
    export OPENAI_API_KEY="your_openai_key"
  4. Get Session ID (Required for API reliability):

    ⚠️ IMPORTANT: For reliable API calls, you need to intercept a session ID from your browser:

    1. Open any hackaprompt challenge in your browser (make sure you're logged in)
    2. Open Developer Tools (F12) → Console tab
    3. Paste and run this code:
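      const originalFetch = window.fetch;
      // Wrap fetch so the request body of each /api/chat call can be inspected.
      window.fetch = function(...args) {
          // args[0] may be a string, a URL, or a Request object
          const url = typeof args[0] === 'string' ? args[0] : (args[0]?.url ?? String(args[0]));
          if (url.includes('/api/chat') && typeof args[1]?.body === 'string') {
              try {
                  const body = JSON.parse(args[1].body);
                  if (body.session_id) {
                      console.log('🎯 SESSION ID:', body.session_id);
                      window.hackaPromptSessionId = body.session_id;
                  }
              } catch (e) {}  // ignore bodies that are not JSON
          }
          // Forward the call unchanged so the page keeps working
          return originalFetch.apply(this, args);
      };
      console.log('✅ Ready! Submit a prompt to capture session ID.');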
    4. Submit any prompt in the challenge (like "hello")
    5. Copy the session ID from the console output
  5. Run challenges:

    # RECOMMENDED: Run with intercepted session ID (most reliable)
    uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges basics --session-id "your-session-id-here"
    
    # Run all challenges with session ID
    uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --session-id "your-session-id-here"
    
    # Run specific challenge with increased steps
    uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges instruction_defense --max-steps 500 --session-id "your-session-id-here"
    
    # Run with proxy (e.g., Burp Suite)
    uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges basics --proxy http://127.0.0.1:8080 --session-id "your-session-id-here"
    
    # Run without session ID (may encounter API errors)
    uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --challenges basics
    
    # List available challenges
    uv run -m hackaprompt.main --list-challenges
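
Because step 2 fails silently when a cookie is missing, it can help to check the copied header before exporting it. The following is a minimal Python sketch, not part of the framework: the function names `cookie_names` and `missing_cookies` are hypothetical, and the patterns are copied from the "Required cookies" list above.

```python
import fnmatch

# Patterns taken from the "Required cookies" list; "*" matches any suffix.
REQUIRED_COOKIE_PATTERNS = [
    "_ga_*",
    "_ga",
    "_fbp",
    "sb-iligpfkvyargzgpcrquc-auth-token.0",
    "sb-iligpfkvyargzgpcrquc-auth-token.1",
    "ph_phc_*",
]


def cookie_names(header: str) -> list[str]:
    """Split a Cookie header value into cookie names (values are ignored)."""
    names = []
    for pair in header.split(";"):
        name, sep, _value = pair.strip().partition("=")
        if sep and name:
            names.append(name)
    return names


def missing_cookies(header: str) -> list[str]:
    """Return the required patterns that no cookie name in the header matches."""
    names = cookie_names(header)
    return [
        pattern
        for pattern in REQUIRED_COOKIE_PATTERNS
        if not any(fnmatch.fnmatchcase(name, pattern) for name in names)
    ]
```

Calling `missing_cookies(os.environ["HACKAPROMPT_SESSION_TOKEN"])` (after `import os`) should return an empty list for a complete header. A non-empty list names the patterns that still need to be copied from the browser.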

How It Works

The framework:

  1. Loads challenges from hackaprompt using your session token
  2. Generates prompt injection attacks using an AI agent deployed via Dreadnode's awesome library, rigging.
  3. Executes attacks against real hackaprompt challenges and returns output from the challenge into the rigging chat pipeline
  4. Validates responses using hackaprompt's judge API
  5. Reports success when the target phrase "I have been PWNED" is achieved
  6. Records metrics and scoring verifiers via the Dreadnode SDK for efficient evaluation
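
The generate, execute, validate, report cycle in steps 2 to 5 can be sketched as a simple loop. This is an illustration under stated assumptions, not the framework's actual code: `generate`, `execute`, and `judge` are hypothetical callables standing in for the rigging agent, the hackaprompt challenge request, and the judge API respectively.

```python
from typing import Callable, Optional

TARGET_PHRASE = "I have been PWNED"  # target phrase named in step 5


def run_attack_loop(
    generate: Callable[[list[str]], str],  # hypothetical: attacker model, sees prior outputs
    execute: Callable[[str], str],         # hypothetical: sends a prompt to the challenge
    judge: Callable[[str], bool],          # hypothetical: stand-in for the judge API
    max_steps: int = 10,
) -> Optional[str]:
    """Return the first prompt the judge accepts, or None if max_steps runs out."""
    history: list[str] = []
    for _ in range(max_steps):
        prompt = generate(history)
        output = execute(prompt)
        if judge(output):
            return prompt
        history.append(output)  # feed the challenge output back to the attacker
    return None
```

Passing the callables in keeps the loop testable offline: a stub `judge` such as `lambda out: TARGET_PHRASE in out` can replace the real judge during development.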

Available Challenges

Use --list-challenges to see all available challenges across all tracks.

Proxy Support

The framework supports HTTP proxies for traffic interception and analysis:

# Use Caido/Burp Suite listening on 127.0.0.1:8080
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --proxy http://127.0.0.1:8080

# Use custom proxy
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct --proxy http://proxy.example.com:3128

# No proxy (default)
uv run -m hackaprompt.main --model groq/moonshotai/kimi-k2-instruct

Important: Since the framework runs inside Docker containers, localhost addresses (127.0.0.1, localhost) are automatically converted to host.docker.internal to access services running on the host machine. SSL certificate verification is disabled for proxy connections to support SSL interception tools like Caido or Burp Suite.

This allows security researchers to:

  • Intercept and analyze hackaprompt API traffic
  • Modify requests/responses for testing
  • Monitor prompt injection attempts
  • Debug authentication issues

Setup for Caido or Burp Suite:

  1. Start Caido/Burp Suite and configure it to listen on 127.0.0.1:8080
  2. Run the framework with --proxy http://127.0.0.1:8080
  3. The framework will automatically convert this to host.docker.internal:8080 for Docker networking
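
The localhost rewrite described above can be sketched with the standard library. This is a minimal sketch of the idea, assuming only the two loopback names mentioned in this section are rewritten. The function name `to_docker_host` is hypothetical and the framework's real implementation may differ.

```python
from urllib.parse import urlsplit, urlunsplit

LOOPBACK_HOSTS = {"127.0.0.1", "localhost"}  # the two addresses named above


def to_docker_host(proxy_url: str) -> str:
    """Rewrite a loopback proxy URL so it is reachable from inside a container."""
    parts = urlsplit(proxy_url)
    if parts.hostname not in LOOPBACK_HOSTS:
        return proxy_url  # non-loopback proxies are left untouched
    netloc = "host.docker.internal"
    if parts.port is not None:
        netloc += f":{parts.port}"
    # Note: any user:password in the original URL is dropped by this sketch.
    return urlunsplit(parts._replace(netloc=netloc))
```

For example, `to_docker_host("http://127.0.0.1:8080")` yields `http://host.docker.internal:8080`, while `http://proxy.example.com:3128` passes through unchanged.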

Architecture

  • Docker isolation for secure code execution
  • Session management with proper cookie handling
  • Real API calls to hackaprompt (no simulation)
  • Concurrent execution across multiple challenges
  • Comprehensive logging via Dreadnode integration
  • Proxy support for traffic analysis and debugging
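
As an illustration of concurrent execution across challenges, here is one possible shape using `asyncio`. It is a sketch only: the README does not say which concurrency mechanism the framework uses, and `run_one` and `limit` are hypothetical names introduced for this example.

```python
import asyncio
from typing import Awaitable, Callable


async def run_all(
    challenges: list[str],
    run_one: Callable[[str], Awaitable[bool]],  # hypothetical per-challenge runner
    limit: int = 3,                             # hypothetical concurrency cap
) -> dict[str, bool]:
    """Run every challenge concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(name: str) -> bool:
        async with semaphore:
            return await run_one(name)

    # gather preserves input order, so results line up with `challenges`
    results = await asyncio.gather(*(guarded(name) for name in challenges))
    return dict(zip(challenges, results))
```

A cap on concurrency is a common choice when every task talks to the same remote API, since it keeps the request rate bounded.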

Challenge Data Structure

The framework uses two JSON files for challenge information:

  • challenges_info.json - Basic challenge metadata from hackaprompt API (names, URLs, status)
  • hackaprompt/challenge_manifest.json - Complete challenge data including:
    • Detailed descriptions with constraints, scoring, and mechanics
    • Complete defending model system prompts for each challenge
    • Challenge types and verbose explanations
    • All context needed for effective prompt injection attacks

The manifest-based approach ensures attack models receive comprehensive challenge context rather than just basic API metadata, enabling more targeted and effective prompt injection strategies.
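
To make the two-file approach concrete, the sketch below merges basic metadata with manifest details. The schema is an assumption made for illustration: it treats `challenges_info.json` as a list of objects with a `name` key and the manifest as an object keyed by challenge name. The real files may be shaped differently, and `load_json` and `build_context` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_json(path: str):
    """Read a JSON file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def build_context(basic: list[dict], manifest: dict[str, dict]) -> list[dict]:
    """Attach manifest details to each basic record, matched by the assumed `name` key."""
    merged = []
    for entry in basic:
        details = manifest.get(entry["name"], {})  # no manifest entry: keep basic data only
        merged.append({**entry, **details})
    return merged
```

Under these assumptions, usage would look like `build_context(load_json("challenges_info.json"), load_json("hackaprompt/challenge_manifest.json"))`, giving the attack model one record per challenge with both the API metadata and the manifest context.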
