Check if a document meets the requirements specified in another document using AI-powered analysis.
SpecTater is a Flask web application that uses AWS Bedrock (Claude or Amazon Nova models) to automatically evaluate whether submission documents (like Certificates of Insurance) meet requirements specified in policy documents. The application returns detailed, structured JSON results with pass/fail analysis for each requirement.
- Intelligent Document Analysis - Uses LLM to compare submissions against policy requirements
- Multiple Format Support - Accepts PDF, TXT, and Markdown files
- AWS Textract Integration - Automatically processes PDFs with tables using OCR
- Structured JSON Output - Returns detailed pass/fail for each requirement
- Session-Based Caching - Reuses policy documents across submissions
- IP Subnet Restriction - Optional access control by IP range
- CLI Testing Tools - Fast iteration with markdown files, wildcard support
- Customizable UI - Configure title, introduction, and static requirements
```bash
# Clone repository
git clone https://github.com/dirkpetersen/spectater.git
cd spectater

# Install dependencies
pip install -r requirements.txt

# Configure environment (copy and edit)
cp .env.default .env
# Edit .env to set your AWS credentials and MODEL_ID
```

Key settings in .env:
```bash
# AWS Bedrock Model (use inference profile for Sonnet 4)
MODEL_ID='us.anthropic.claude-sonnet-4-5-20250929-v1:0'

# Application Settings
FLASK_DEBUG=True
FLASK_PORT=5000

# Optional: IP Subnet Restriction
# SUBNET_ONLY='192.168.0.0/16'   # Allow only this subnet
# SUBNET_ONLY='127.0.0.1/32'     # Localhost only (default)

# Optional: UI Customization
# TITLE='My Custom Title'
# INTRODUCTION='intro.txt'       # Show intro text from file
# REQUIREMENTS='requirements.md' # Use static requirements (hides upload)
```

Run the application:

```bash
python app.py
# Access at http://localhost:5000
```
- Upload Policy Document (PDF, TXT, or MD)
  - Contains the requirements to check against
  - Cached per session for reuse
  - Or use `REQUIREMENTS=file.md` in .env for static requirements
- Upload Submission Document (PDF or MD)
  - The document to be evaluated
  - PDFs with tables automatically use Textract OCR
  - Auto-submits and shows progress messages
- View Results
  - Status Box: GREEN (pass) or RED (fail)
  - Detailed Table: Pass/fail for each requirement with color coding
  - Raw JSON: Complete structured output with model ID
Fast iteration without web browser:
```bash
# Basic usage
./testeval.py --spec requirements.md --submit document.md

# With debug output (shows JSON parsing, model info)
./testeval.py --spec requirements.md --submit document.md --debug

# Batch testing with wildcards (runs a separate evaluation for each)
./testeval.py --spec requirements.md --submit 'cache/*/submission_*.md' --debug
```

Convert PDFs to markdown with `pdf2md.py`:

```bash
# Convert PDF to markdown
./pdf2md.py certificate.pdf certificate.md

# Auto-generate output filename
./pdf2md.py certificate.pdf
```

The LLM returns structured JSON:
```json
{
  "summary": {
    "statement": "Overall compliance assessment",
    "totalChecks": 10,
    "passed": 8,
    "failed": 2
  },
  "requirements": [
    {
      "requirement": "General Liability Limit",
      "policyRequirement": "$2,000,000 per occurrence",
      "submissionValue": "$1,000,000 per occurrence",
      "pass": false,
      "notes": "Does not meet minimum requirement"
    }
  ]
}
```

Status Determination:

- Automatically checks all `pass` fields in the requirements array
- RED if any requirement has `pass: false`
- GREEN if all requirements have `pass: true`
- Validates summary counts match actual requirements (shows a warning if inconsistent)
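A minimal sketch of this status logic (function and variable names are illustrative, not the app's actual code):

```python
# Sketch: derive the overall status box color from the parsed LLM JSON.
def overall_status(result: dict) -> str:
    reqs = result.get("requirements", [])
    summary = result.get("summary", {})

    # RED if any single requirement failed; GREEN only if all passed.
    status = "GREEN" if reqs and all(r.get("pass") for r in reqs) else "RED"

    # Cross-check the model's summary counts against the array itself.
    passed = sum(1 for r in reqs if r.get("pass"))
    if summary.get("totalChecks") != len(reqs) or summary.get("passed") != passed:
        print("Warning: summary counts do not match the requirements array")

    return status
```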
When a PDF contains tables, the app automatically:
- Creates temporary S3 bucket (if needed)
- Uploads PDF to S3
- Runs Textract OCR analysis
- Converts results to markdown
- Cleans up S3 objects
- Deletes bucket on app shutdown
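A condensed sketch of that round trip with boto3 (simplified: the real bucket name includes account ID and region, and error handling is omitted):

```python
import boto3

s3 = boto3.client("s3")
textract = boto3.client("textract")
bucket = "spectater-textract-example"  # illustrative; see the naming note below

# Outside us-east-1, create_bucket also needs a CreateBucketConfiguration.
s3.create_bucket(Bucket=bucket)                                # temporary bucket
s3.upload_file("certificate.pdf", bucket, "certificate.pdf")   # upload PDF

# Synchronous table analysis; multi-page PDFs would need the async
# start_document_analysis / get_document_analysis pair instead.
resp = textract.analyze_document(
    Document={"S3Object": {"Bucket": bucket, "Name": "certificate.pdf"}},
    FeatureTypes=["TABLES"],
)
# resp["Blocks"] (PAGE / LINE / TABLE / CELL) is then converted to markdown.

s3.delete_object(Bucket=bucket, Key="certificate.pdf")         # clean up objects
s3.delete_bucket(Bucket=bucket)                                # delete on shutdown
```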
- Session-based: Flask sessions (not cookies) track users
- Cache Structure: `cache/{session_id}/policy.md` and `cache/{session_id}/submission_{hash}.md`
- Debug Mode: Submission documents cached only when `FLASK_DEBUG=True`
- Hash-based: Submission cache uses filename hash for consistent lookups
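For illustration, the hash-based cache path could be built like this (the hash algorithm and truncation are assumptions; only the directory layout comes from the README):

```python
import hashlib
from pathlib import Path

# Sketch: per-session cache path for a submission (layout per the README).
def submission_cache_path(session_id: str, filename: str) -> Path:
    file_hash = hashlib.sha256(filename.encode()).hexdigest()[:16]  # assumed scheme
    return Path("cache") / session_id / f"submission_{file_hash}.md"

print(submission_cache_path("a1b2c3", "certificate.pdf"))
# cache/a1b2c3/submission_<hash>.md
```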
- Automatically calculates max_tokens as 2/3 of input bytes
- Minimum 5000 tokens to prevent JSON truncation
- Uses `MAX_TOKENS` from .env as floor if configured
- Extended 1M token context for Claude Sonnet 4 (when tokens > 200,000)
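A minimal sketch of that calculation (names are illustrative):

```python
import os

# Sketch: dynamic max_tokens per the rules above.
def calc_max_tokens(input_bytes: int) -> int:
    tokens = (input_bytes * 2) // 3                     # 2/3 of input bytes
    tokens = max(tokens, 5000)                          # floor against JSON truncation
    env_floor = int(os.environ.get("MAX_TOKENS", "0"))  # optional .env floor
    return max(tokens, env_floor)

print(calc_max_tokens(12_000))  # -> 8000
```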
Restrict access by IP address:
```bash
SUBNET_ONLY='10.10.0.0/16'    # Allow 10.10.x.x
SUBNET_ONLY='192.168.1.0/24'  # Allow 192.168.1.x
SUBNET_ONLY='127.0.0.1/32'    # Localhost only (default)
SUBNET_ONLY='0.0.0.0/0'       # Allow all (no restriction)
```

Unauthorized users see: "You are not authorized to access this application from your location. You may need to connect via VPN with split tunneling DISABLED."
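The check itself maps cleanly onto Python's standard ipaddress module; a sketch (the app's actual implementation may differ):

```python
import ipaddress

# Sketch: allow a request only when the client IP is inside SUBNET_ONLY.
def ip_allowed(client_ip: str, subnet: str = "127.0.0.1/32") -> bool:
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(subnet)

print(ip_allowed("10.10.4.2", "10.10.0.0/16"))    # True
print(ip_allowed("192.168.1.5", "10.10.0.0/16"))  # False
```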
Create a customized experience:
```bash
# Custom title
TITLE='OSU Insurance Compliance Checker'

# Show introduction text
INTRODUCTION='welcome.txt'

# Use static requirements (hides policy upload field)
REQUIREMENTS='osu-requirements.md'
```

Implementation details:

- Policy Files: PDF, TXT, MD → Always cached
- Submission Files: PDF (with Textract for tables), MD → Cached in debug mode
- Character Limiting: Applied after markdown conversion (configurable)
- Temperature: Hardcoded to 0 (deterministic output)
- Cache Control: Disabled (`cache_control: {"type": "ephemeral"}`)
- Prompt Template: `analysis-prompt.md` (uses string replacement, not .format())
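String replacement sidesteps the KeyError that str.format() would raise on the literal JSON braces in the prompt template. An illustrative sketch (the placeholder tokens are assumptions, not the template's actual markers):

```python
# Sketch: fill analysis-prompt.md with .replace(), not .format().
policy_md = "# Requirements\n..."
submission_md = "# Certificate\n..."

with open("analysis-prompt.md") as f:
    template = f.read()

# .format() would choke on braces like {"summary": ...}; .replace() does not.
prompt = template.replace("{POLICY}", policy_md).replace("{SUBMISSION}", submission_md)
```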
When processing documents, users see rotating status messages:
- Uploading documents...
- Reading PDF files...
- Detecting tables and complex elements...
- Processing tables with OCR if needed...
- Analyzing submission against requirements...
- (14 messages total, changing every 3 seconds)
If you see "Expecting ',' delimiter" errors:
- Increase MAX_TOKENS in .env (or remove to use dynamic calculation)
- Try a model with higher output capacity (Sonnet vs Haiku)
- Check debug output to see where JSON truncates
Claude Sonnet 4 requires inference profile ARN:
```bash
# Use this (with inference profile)
MODEL_ID='us.anthropic.claude-sonnet-4-5-20250929-v1:0'

# Not this (direct model ID won't work)
MODEL_ID='anthropic.claude-sonnet-4-5-20250929-v1:0'
```

If summary counts don't match requirements:

- Review `analysis-prompt.md` for clarity
- Check if the policy document has ambiguous requirements
- Enable debug mode to see actual vs summary counts
When FLASK_DEBUG=True:
- Submission documents cached for faster re-testing
- Console shows LLM response preview
- JSON parsing details printed
- Summary count validation logged
- Use `testeval.py` with markdown files for rapid prompt iteration
- Use wildcard patterns to batch-test multiple submissions
- Enable the `--debug` flag to see the complete JSON parsing flow
- Check cache files in `cache/{session_id}/` to verify extraction
MIT License - see LICENSE file
- Python 3.12+
- AWS account with Bedrock access
- AWS credentials configured (via ~/.aws/credentials or environment variables)
- For Textract: S3 and Textract permissions
The application requires the following AWS IAM permissions:
For basic PDF analysis without Textract:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-*",
        "arn:aws:bedrock:*::foundation-model/amazon.nova-*"
      ]
    }
  ]
}
```

For complete functionality including table extraction from PDFs:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-*",
        "arn:aws:bedrock:*::foundation-model/amazon.nova-*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:CreateBucket",
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:DeleteBucket",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::spectater-textract-*",
        "arn:aws:s3:::spectater-textract-*/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "textract:AnalyzeDocument",
        "textract:GetDocumentAnalysis"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "sts:GetCallerIdentity"
      ],
      "Resource": "*"
    }
  ]
}
```

Note: The S3 bucket is created automatically with the naming pattern `{APP_NAME}-textract-{account_id}-{region}` and is cleaned up when the application shuts down.
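For reference, that name can be reproduced with STS; a sketch (assuming boto3 and APP_NAME=spectater):

```python
import boto3

# Sketch: rebuild the bucket name {APP_NAME}-textract-{account_id}-{region}.
session = boto3.session.Session()
account_id = session.client("sts").get_caller_identity()["Account"]
region = session.region_name or "us-east-1"

print(f"spectater-textract-{account_id}-{region}")
# e.g. spectater-textract-123456789012-us-west-2
```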