
CHA0S-CORP/PEFA


🛡️ PEFA: Phishing Email Forensic Analyzer


A Python CLI tool that converts .eml files into cyber-infographic PNGs and interactive HTML reports. PEFA performs automated forensic analysis of phishing indicators and produces a composite threat score (0–100) backed by multiple detection engines and optional threat intelligence APIs.

Sample PEFA report: WELLS FARGO BANK phishing analysis

☝️ Click to see full report · 🔗 Interactive HTML version · All sample reports

📂 More sample reports (17 total)

Report | PNG | HTML
ATTENTION DEAR | 🖼️ PNG | 🌐 HTML
Congratulations Dear | 🖼️ PNG | 🌐 HTML
Dear Friend | 🖼️ PNG | 🌐 HTML
Dear Winner | 🖼️ PNG | 🌐 HTML
File | 🖼️ PNG | 🌐 HTML
Greetings to you | 🖼️ PNG | 🌐 HTML
HAPPY NEW YEAR! | 🖼️ PNG | 🌐 HTML
Konto-Überprüefig (Swiss German) | 🖼️ PNG | 🌐 HTML
Online Bank Of Africa | 🖼️ PNG | 🌐 HTML
Please I Need Your Urgent Attention | 🖼️ PNG | 🌐 HTML
INSTRUCTION TO CREDIT YOUR ACCOUNT ($25M) | 🖼️ PNG | 🌐 HTML
THIS IS YOUR ATM VISA CARD | 🖼️ PNG | 🌐 HTML
Text or Call +1 225 463 0148 | 🖼️ PNG | 🌐 HTML
URGENT RESPONSE | 🖼️ PNG | 🌐 HTML
Votre colis est prêt pour la livraison | 🖼️ PNG | 🌐 HTML
Your Funds Update! | 🖼️ PNG | 🌐 HTML
original_msg | 🖼️ PNG | 🌐 HTML

✨ Features

  • 🎯 Threat Scoring: Weighted 0–100 composite score across 7 categories with 5 severity levels (Clean / Low / Medium / High / Critical)
  • 🔗 Link Analysis: HREF mismatches, brand lookalikes, homoglyph domains, IP-based URLs, URL shorteners, suspicious TLDs, JavaScript/data URIs
  • 👤 Sender Spoofing Detection: Display name spoofing, Return-Path/Reply-To mismatches, domain impersonation, homoglyph characters
  • ⚡ Urgency Language Scanning: 24 social-engineering pressure patterns, generic greeting detection, keyword density scoring
  • 📎 Attachment Threat Assessment: 40+ dangerous extensions, macro-enabled documents, double extensions, MIME mismatches, file hashing (MD5/SHA256)
  • 🔐 Authentication Checks: SPF, DKIM, and DMARC validation from headers (with optional MXToolbox deep validation)
  • 🛤️ Delivery Path Tracing: Full email hop trace with IP geolocation per relay
  • 📅 Domain Age Lookup: WHOIS-based registration date and age risk assessment
  • 🔤 Language Quality Analysis: Mixed-script detection, entropy analysis, zero-width characters, irregular spacing
  • 🧬 IOC Extraction: Consolidated Indicators of Compromise (IPs, domains, URLs, emails, hashes) with optional enrichment
  • 🤖 AI Assessment: Optional Google Gemini analysis with verdict, confidence score, attack classification, and recommended actions
  • 📊 Interactive HTML Reports: Collapsible sections, scroll-spy navigation, copy-to-clipboard, animated threat gauge, tooltips
  • 📁 Batch Processing: Analyze entire directories of .eml files with a single command
  • 🌐 Web UI: Browser-based upload interface with live analysis (no Playwright needed client-side)

📦 Installation

pip install pefa
playwright install chromium

Or install from source:

pip install .
playwright install chromium

Requires Python 3.10+ · PyPI page

🚀 Quick Start

# Analyze a single email → PNG infographic
pefa input.eml

# Also generate an interactive HTML report
pefa input.eml --html

# Include Gemini AI assessment
pefa input.eml --gemini

# Skip all external API calls (fully offline)
pefa input.eml --no-api

# Batch process a directory
pefa ./emails/ -o ./reports/

# Launch the web UI
pefa --web --port 8080

Or run as a module:

python3 -m pefa input.eml

📖 Usage Examples

Analyze a single email

pefa suspicious-email.eml

This produces suspicious-email.png in the current directory: a full-page infographic with threat score, sender analysis, link flags, authentication results, and the rendered email body.

Generate both PNG and interactive HTML

pefa suspicious-email.eml --html

Outputs two files: suspicious-email.png and suspicious-email.html. The HTML report includes collapsible sections, scroll-spy navigation, an animated threat gauge, copy-to-clipboard for IOCs, and print/download buttons.

Save output to a specific location

pefa suspicious-email.eml -o ./reports/case-42.png
pefa suspicious-email.eml -o ./reports/case-42.png --html

The -o flag sets the output path. When combined with --html, the HTML file is placed alongside the PNG.

Batch process a folder of emails

pefa ./inbox/ -o ./reports/

Analyzes every .eml file in ./inbox/ and writes reports to ./reports/. A single Playwright browser instance is reused across all files for faster processing. If -o is omitted, reports go to ./inbox/reports/.
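The batch behaviour can be previewed from Python before running anything. The sketch below pairs each .eml file with the report path a batch run would be expected to write. The helper `plan_batch` is hypothetical, and the assumption that each report keeps the stem of its source file is inferred from the single-file example (suspicious-email.eml to suspicious-email.png), not confirmed for batch mode.

```python
from pathlib import Path


def plan_batch(input_dir, output_dir=None):
    """Pair each .eml file with the PNG path a batch run is expected to write."""
    src = Path(input_dir)
    # Documented default: reports go to <input_dir>/reports/ when -o is omitted.
    dest = Path(output_dir) if output_dir else src / "reports"
    # Assumption: each report keeps the stem of its source file.
    return [(eml, dest / f"{eml.stem}.png") for eml in sorted(src.glob("*.eml"))]
```

Calling `plan_batch("./inbox/")` lists the expected outputs under ./inbox/reports/, which is handy for checking that nothing will be overwritten.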

Include AI-powered assessment

export GEMINI_API_KEY="your-key-here"
pefa suspicious-email.eml --gemini

Adds a Gemini AI section to the report with a verdict (phishing/legitimate/suspicious), confidence score, attack type classification, and recommended actions. The AI assessment can also influence the overall threat score (+25 or +50 points).

To use a different Gemini model:

pefa suspicious-email.eml --gemini --gemini-model gemini-2.5-pro

Run fully offline (no API calls)

pefa suspicious-email.eml --no-api

Skips all external lookups (IP geolocation, WHOIS, urlscan, VirusTotal, AbuseIPDB, AlienVault, MXToolbox). The analysis still runs SPF/DKIM/DMARC checks from headers, link analysis, urgency detection, and attachment scanning, all locally.
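As an illustration of what a header-only authentication check involves, the following sketch reads SPF, DKIM, and DMARC verdicts from Authentication-Results headers with the standard library. It is an assumption about the general technique, not PEFA's parser; `auth_results` and the "missing" placeholder are hypothetical.

```python
import re
from email import message_from_string

# Matches "spf=pass", "dkim=fail", "dmarc=none", ... inside Authentication-Results.
AUTH_RESULT = re.compile(r"\b(spf|dkim|dmarc)=([a-z]+)", re.IGNORECASE)


def auth_results(raw_email):
    """Return the SPF, DKIM, and DMARC verdicts claimed by the message headers."""
    msg = message_from_string(raw_email)
    results = {"spf": "missing", "dkim": "missing", "dmarc": "missing"}
    for header in msg.get_all("Authentication-Results", []):
        for mechanism, verdict in AUTH_RESULT.findall(header):
            key = mechanism.lower()
            # Keep the first verdict seen for each mechanism (the topmost header).
            if results[key] == "missing":
                results[key] = verdict.lower()
    return results
```

Because these verdicts are only what the receiving server wrote into the headers, they need no network access, which is why this kind of check still works with --no-api.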

Customize image dimensions

pefa suspicious-email.eml --width 1400 --scale 2

--width sets the viewport width in pixels (default: 1000). --scale sets the device scale factor (default: 1.5); higher values produce sharper images at larger file sizes.
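The two options multiply: a 1400 px viewport at scale 2 yields an image roughly 2800 px wide. The sketch below shows that arithmetic and how a viewport width and device scale factor are typically passed to Playwright's Python API. `output_pixel_width` and `render_png` are hypothetical helpers, the 800 px height is a placeholder, and this is not PEFA's renderer.

```python
def output_pixel_width(width=1000, scale=1.5):
    """Pixel width of the screenshot: CSS viewport width times device scale factor."""
    return round(width * scale)


def render_png(html, out_path, width=1000, scale=1.5):
    """Render an HTML string to a full-page PNG (needs `playwright install chromium`)."""
    # Imported lazily so the arithmetic helper above works without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(
            viewport={"width": width, "height": 800},  # height is a placeholder
            device_scale_factor=scale,
        )
        page.set_content(html)
        page.screenshot(path=out_path, full_page=True)
        browser.close()
```

With the defaults, `output_pixel_width()` gives 1500, so doubling --scale roughly quadruples the pixel count of the image.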

Launch the web UI

pefa --web

Opens a browser-based drag-and-drop interface at http://localhost:8080. Upload .eml files and view interactive HTML reports directly; no Playwright is needed on the client side.

pefa --web --port 9090
pefa --web --no-api
pefa --web --gemini

The web UI respects --no-api and --gemini flags.

Combine multiple flags

# Full analysis with AI, HTML output, and high-res image
pefa suspicious-email.eml --html --gemini --width 1200 --scale 2

# Batch process offline with HTML reports
pefa ./inbox/ -o ./reports/ --html --no-api

# Run the sample emails included in the repo
pefa samples/ -o examples/ --html

Use with threat intel API keys

Set any combination of API keys to enrich reports with external intelligence:

export GEMINI_API_KEY="..."       # AI assessment
export URLSCAN_API_KEY="..."      # Domain reputation
export VT_API_KEY="..."           # VirusTotal IOC reputation
export ABUSEIPDB_API_KEY="..."    # IP abuse reports
export OTX_API_KEY="..."          # AlienVault OTX threat intel
export MXTOOLBOX_API_KEY="..."    # Deep email auth validation

pefa suspicious-email.eml --html --gemini

Each integration activates independently, so you don't need all keys. Missing keys are silently skipped.
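The "activate independently, skip silently" behaviour can be pictured as a lookup over environment variables. This is a sketch of that idea, not PEFA's code: `enabled_integrations` is a hypothetical helper, and only the variable names and the --gemini / --no-api rules come from this README.

```python
import os

# Environment variables named in this README, mapped to the service they unlock.
INTEGRATIONS = {
    "GEMINI_API_KEY": "Google Gemini",
    "URLSCAN_API_KEY": "urlscan.io",
    "VT_API_KEY": "VirusTotal",
    "ABUSEIPDB_API_KEY": "AbuseIPDB",
    "OTX_API_KEY": "AlienVault OTX",
    "MXTOOLBOX_API_KEY": "MXToolbox",
}


def enabled_integrations(environ=None, gemini_flag=False, no_api=False):
    """List the key-gated services a run would use for the given settings."""
    environ = os.environ if environ is None else environ
    if no_api:
        return []  # --no-api disables every external lookup
    enabled = []
    for var, service in INTEGRATIONS.items():
        if not environ.get(var):
            continue  # missing key: skip silently
        if var == "GEMINI_API_KEY" and not gemini_flag:
            continue  # Gemini also needs the explicit --gemini flag
        enabled.append(service)
    return enabled
```

Passing a plain dict instead of `os.environ` makes it easy to check a configuration without touching the real environment.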

⚙️ CLI Reference

usage: pefa [-h] [--web] [--port PORT] [-o OUTPUT] [--width WIDTH]
            [--scale SCALE] [--html] [--gemini]
            [--gemini-model MODEL] [--no-api]
            [input]

positional arguments:
  input                 .eml file or directory of .eml files

options:
  -o, --output          Output path for generated reports
  --web                 Start browser-based web UI
  --port                Web server port (default: 8080)
  --width               Viewport width in pixels (default: 1000)
  --scale               Device scale factor (default: 1.5)
  --html                Emit interactive HTML report alongside PNG
  --gemini              Include Gemini AI assessment
  --gemini-model        Gemini model to use (default: gemini-2.5-flash)
  --no-api              Skip all external API lookups
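For scripting, the documented flags can be assembled into an argument list and handed to `subprocess`. The sketch below uses only flags from the reference above; `pefa_command` and `run_pefa` are hypothetical wrappers, and relying on a non-zero exit status to signal failure is an assumption about the CLI.

```python
import subprocess


def pefa_command(input_path, output=None, html=False, gemini=False,
                 gemini_model=None, no_api=False, width=None, scale=None):
    """Build an argv list for the pefa CLI using only the documented flags."""
    cmd = ["pefa", str(input_path)]
    if output:
        cmd += ["-o", str(output)]
    if width is not None:
        cmd += ["--width", str(width)]
    if scale is not None:
        cmd += ["--scale", str(scale)]
    if html:
        cmd.append("--html")
    if gemini:
        cmd.append("--gemini")
    if gemini_model:
        cmd += ["--gemini-model", gemini_model]
    if no_api:
        cmd.append("--no-api")
    return cmd


def run_pefa(*args, **kwargs):
    """Run pefa; raises CalledProcessError if it exits with a non-zero status."""
    return subprocess.run(pefa_command(*args, **kwargs), check=True)
```

Building the list separately from running it keeps the command easy to log or inspect, and avoids shell quoting issues with file names that contain spaces.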

🎯 Threat Scoring

PEFA calculates a composite threat score from 0 to 100 using weighted categories:

Category | Max Points | What It Measures
🔐 Authentication | 20 | SPF, DKIM, DMARC failures
👤 Sender | 20 | Spoofing, homoglyphs, header mismatches
🔗 Links | 25 | HREF mismatches, brand lookalikes, IP URLs, shorteners
⚡ Urgency | 15 | Pressure language patterns, generic greetings
📎 Attachments | 10 | Dangerous extensions, macros, double extensions
🔤 Language | 5 | Mixed scripts, entropy anomalies, quality issues
📅 Domain Age | 10 | Newly registered or young domains

Passing all authentication checks and having an established domain (3+ years) applies negative scoring. Gemini AI verdicts can add up to +50 additional points.
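The category maxima in the table add up to 105, and the Gemini boost can add more, so some form of clamping to 0–100 is implied. The sketch below shows one plausible way to combine the pieces. It assumes a plain sum of per-category points, each capped at its maximum, followed by a clamp; PEFA's scoring.py may weight or combine things differently, and `composite_score` is a hypothetical name.

```python
# Per-category caps copied from the table above.
CATEGORY_CAPS = {
    "authentication": 20,
    "sender": 20,
    "links": 25,
    "urgency": 15,
    "attachments": 10,
    "language": 5,
    "domain_age": 10,
}


def composite_score(points, gemini_boost=0):
    """Sum capped category points, add the AI boost, and clamp to 0-100."""
    total = 0
    for category, cap in CATEGORY_CAPS.items():
        # Negative values model the credit for passing authentication
        # or for a long-established domain.
        total += min(points.get(category, 0), cap)
    return max(0, min(100, total + gemini_boost))
```

For example, 25 points for links plus a +50 Gemini boost gives 75, while maxing out every category still reports 100 rather than 105.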

Threat Levels:

Level | Score
🔴 Critical | 70–100
🟠 High | 45–69
🟡 Medium | 25–44
🟢 Low | 10–24
⚪ Clean | 0–9
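The bands above translate directly into a threshold lookup. A minimal sketch, with `threat_level` as a hypothetical helper name:

```python
# (lower bound, label), checked from highest to lowest.
THREAT_LEVELS = [
    (70, "Critical"),
    (45, "High"),
    (25, "Medium"),
    (10, "Low"),
    (0, "Clean"),
]


def threat_level(score):
    """Map a 0-100 composite score to its severity label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, label in THREAT_LEVELS:
        if score >= lower_bound:
            return label
```

The boundaries are inclusive on the lower end, so a score of exactly 45 is High and 44 is Medium.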

🔌 API Integrations

All API integrations are optional. PEFA works fully offline with --no-api. Each integration checks for its own environment variable and silently skips if unavailable. No API key is required to run a basic analysis: PEFA performs link analysis, urgency detection, sender spoofing checks, attachment scanning, authentication header parsing, and threat scoring entirely locally.

Overview

Service | Environment Variable | Free? | What It Adds to Reports
🤖 Google Gemini | GEMINI_API_KEY | Free tier available | AI verdict, attack classification, recommended actions
🔍 urlscan.io | URLSCAN_API_KEY | Free tier available | URL/domain reputation verdicts
📧 MXToolbox | MXTOOLBOX_API_KEY | Paid | Deep SPF/DKIM/DMARC validation against live DNS
🦠 VirusTotal | VT_API_KEY | Free tier available | IOC reputation (IPs, domains, URLs, file hashes)
🚨 AbuseIPDB | ABUSEIPDB_API_KEY | Free tier available | IP abuse confidence scores and report counts
👽 AlienVault OTX | OTX_API_KEY | Free | Threat intelligence pulse counts and reputation
🌍 ip-api.com | (none) | Free | IP geolocation for delivery path hops
📋 WHOIS | (none) | Free | Domain registration age and registrar info

Getting API Keys

Google Gemini

Sign up at Google AI Studio to get a free API key. The free tier provides generous rate limits suitable for individual use.

export GEMINI_API_KEY="your-key-here"

Gemini provides an AI-powered phishing assessment that includes a verdict (phishing / suspicious / legitimate), confidence percentage, executive summary, technical analysis, attack type classification, and recommended actions. It can also boost the threat score by up to +50 points.

# Use default model (gemini-2.5-flash)
pefa email.eml --gemini

# Use a more capable model
pefa email.eml --gemini --gemini-model gemini-2.5-pro

Note: The --gemini flag is required to activate AI analysis even if GEMINI_API_KEY is set. This keeps AI calls explicit.

urlscan.io

Sign up at urlscan.io for a free account. Navigate to your profile to find your API key.

export URLSCAN_API_KEY="your-key-here"

When suspicious links are detected, PEFA queries urlscan.io for domain reputation data including overall verdict (malicious/suspicious/benign), page metadata, and redirect statistics. Results link directly to the urlscan.io result page for manual investigation.

MXToolbox

Sign up at MXToolbox for an API subscription.

export MXTOOLBOX_API_KEY="your-key-here"

Performs live DNS-based validation of SPF, DKIM, and DMARC records for the sender's domain. This goes beyond parsing email headers: it checks the actual DNS configuration. If MXToolbox results contradict the header claims (e.g., headers say DKIM pass but DNS shows a failure), PEFA flags the discrepancy as a warning.
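The discrepancy check amounts to comparing two sets of verdicts. The sketch below assumes both sources have already been reduced to simple dicts such as `{"dkim": "pass"}`; that shape, and the name `auth_discrepancies`, are illustrative assumptions rather than PEFA's data model.

```python
def auth_discrepancies(header_results, dns_results):
    """List mechanisms where the header claim and the live DNS result differ."""
    warnings = []
    for mechanism in ("spf", "dkim", "dmarc"):
        claimed = header_results.get(mechanism)
        observed = dns_results.get(mechanism)
        if claimed is None or observed is None:
            continue  # nothing to compare for this mechanism
        if claimed != observed:
            warnings.append(
                f"{mechanism.upper()}: header says {claimed}, DNS check says {observed}"
            )
    return warnings
```

A header that claims `dkim=pass` while the DNS check reports a failure produces a single DKIM warning, and matching results produce none.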

VirusTotal

Sign up at VirusTotal for a free community account. Your API key is available on your profile page.

export VT_API_KEY="your-key-here"

Enriches extracted IOCs with multi-vendor detection results:

  • IPs (up to 5): malicious/suspicious/harmless detection counts and reputation score
  • Domains (up to 5): same detection breakdown plus reputation
  • URLs (up to 3): vendor detection counts
  • File hashes (up to 5): detection counts and meaningful filenames
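These caps bound the work per email at 5 + 5 + 3 + 5 = 18 lookups. A sketch of how such limits could be applied before any request is made is shown here. The function name, the dict keys, and the choice to deduplicate and keep the first items in order are all assumptions; the README does not say how PEFA picks which indicators to look up.

```python
# Per-type lookup limits listed above.
LOOKUP_LIMITS = {"ips": 5, "domains": 5, "urls": 3, "hashes": 5}


def select_for_lookup(iocs):
    """Deduplicate each IOC list (keeping order) and truncate it to its limit."""
    selected = {}
    for kind, limit in LOOKUP_LIMITS.items():
        unique = list(dict.fromkeys(iocs.get(kind, [])))
        selected[kind] = unique[:limit]
    return selected
```

Capping up front keeps a link-heavy email from exhausting a rate-limited API quota in a single run.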

Free tier: 4 requests/minute, 500 requests/day, 15.5K requests/month.

AbuseIPDB

Sign up at AbuseIPDB for a free account.

export ABUSEIPDB_API_KEY="your-key-here"

Checks IP addresses (up to 5) against AbuseIPDB's crowd-sourced abuse report database. Returns an abuse confidence score (0–100), total number of reports, whitelist status, country, and ISP. Queries cover reports from the last 90 days.
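For reference, a lookup like this corresponds to AbuseIPDB's v2 `check` endpoint, which takes the IP and a `maxAgeInDays` window and authenticates with a `Key` header. The sketch below only builds the request with the standard library so it can be inspected without network access; `build_abuseipdb_request` is a hypothetical helper, and how PEFA issues the call internally is not documented here.

```python
from urllib.parse import urlencode
from urllib.request import Request

ABUSEIPDB_CHECK_URL = "https://api.abuseipdb.com/api/v2/check"


def build_abuseipdb_request(ip, api_key, max_age_days=90):
    """Build (but do not send) a GET request for one IP, covering the last 90 days."""
    query = urlencode({"ipAddress": ip, "maxAgeInDays": max_age_days})
    return Request(
        f"{ABUSEIPDB_CHECK_URL}?{query}",
        headers={"Key": api_key, "Accept": "application/json"},
    )
```

Passing the result to `urllib.request.urlopen` would perform the lookup; the JSON response carries the confidence score, report count, whitelist status, country, and ISP under its `data` object.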

Free tier: 1,000 checks/day.

AlienVault OTX

Sign up at AlienVault OTX for a free account.

export OTX_API_KEY="your-key-here"

Queries the Open Threat Exchange for community-sourced threat intelligence. Returns pulse counts (how many threat intelligence reports reference the IOC) and reputation scores for:

  • IPs (up to 5)
  • Domains (up to 5)
  • URLs (up to 3)
  • File hashes (up to 5)

Setting Up All API Keys

For maximum enrichment, configure all keys in your shell profile (~/.bashrc, ~/.zshrc, etc.):

# AI assessment (also requires the --gemini flag at run time)
export GEMINI_API_KEY="your-gemini-key"

# Threat intelligence (activate automatically when set)
export URLSCAN_API_KEY="your-urlscan-key"
export VT_API_KEY="your-virustotal-key"
export ABUSEIPDB_API_KEY="your-abuseipdb-key"
export OTX_API_KEY="your-alienvault-key"

# Email authentication
export MXTOOLBOX_API_KEY="your-mxtoolbox-key"

Then run with full enrichment:

pefa email.eml --html --gemini

API Usage Examples

# Fully offline: no API calls at all
pefa email.eml --no-api

# Basic analysis with free APIs only (ip-api.com + WHOIS)
# No env vars needed
pefa email.eml

# Add AI assessment only
export GEMINI_API_KEY="..."
pefa email.eml --gemini

# IOC enrichment with VirusTotal + AbuseIPDB
export VT_API_KEY="..."
export ABUSEIPDB_API_KEY="..."
pefa email.eml --html

# Full enrichment: all APIs + AI + HTML report
export GEMINI_API_KEY="..."
export VT_API_KEY="..."
export ABUSEIPDB_API_KEY="..."
export OTX_API_KEY="..."
export URLSCAN_API_KEY="..."
export MXTOOLBOX_API_KEY="..."
pefa email.eml --html --gemini

# Batch process with full enrichment
pefa ./emails/ -o ./reports/ --html --gemini

How APIs Affect the Report

Without any API keys, PEFA still performs:

  • Header-based SPF/DKIM/DMARC checks
  • Link analysis (mismatches, brand impersonation, homoglyphs, suspicious TLDs)
  • Sender spoofing detection
  • Urgency language scanning
  • Attachment threat assessment
  • Language quality analysis
  • Threat scoring (0–100)
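As an example of a purely local check from this list, here is a small sketch of two language-quality signals: zero-width characters and words that mix scripts. It approximates a character's script by the first word of its Unicode name, which is a rough heuristic chosen for illustration. `script_of` and `language_flags` are hypothetical names and this is not PEFA's LanguageAnalyzer.

```python
import unicodedata

# Common invisible characters used to break up keywords.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def script_of(ch):
    """Rough script label: the first word of the Unicode character name."""
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None


def language_flags(text):
    """Return human-readable flags for zero-width characters and mixed-script words."""
    flags = []
    if any(ch in ZERO_WIDTH for ch in text):
        flags.append("zero-width characters")
    for word in text.split():
        scripts = {s for s in map(script_of, word) if s}
        if len(scripts) > 1:
            flags.append(f"mixed scripts in {word!r}: {sorted(scripts)}")
    return flags
```

A word such as "paypal" spelled with a Cyrillic "а" in place of the Latin "a" is flagged as mixing CYRILLIC and LATIN, even though it renders almost identically.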

Adding API keys progressively enriches the report:

APIs Configured | Additional Report Sections
(none) | Base analysis with all local checks
+ GEMINI_API_KEY | AI Assessment panel with verdict, confidence, attack classification
+ VT_API_KEY | IOC table with VirusTotal detection counts per indicator
+ ABUSEIPDB_API_KEY | IP abuse confidence scores in IOC table
+ OTX_API_KEY | Threat intelligence pulse counts in IOC table
+ URLSCAN_API_KEY | URL reputation verdicts in link analysis section
+ MXTOOLBOX_API_KEY | Deep DNS validation results in authentication section

🏗️ Architecture

.eml file → parser.py → pipeline.run_analysis() → PageRenderer.build() → Playwright → .png/.html

pefa/
├── cli.py                  # CLI argument parsing and entry point
├── parser.py               # .eml parsing and header extraction
├── pipeline.py             # Analysis orchestrator
├── scoring.py              # Weighted threat score calculation
├── highlighting.py         # Email body highlighting (urgency keywords, suspicious links)
├── constants.py            # TLDs, shorteners, extensions, regex patterns, homoglyphs
├── deps.py                 # Centralized optional dependency imports
├── analyzers/
│   ├── links.py            # LinkAnalyzer: URL and domain analysis
│   ├── sender.py           # SenderAnalyzer: spoofing and impersonation
│   ├── urgency.py          # UrgencyAnalyzer: pressure language patterns
│   ├── attachments.py      # AttachmentAnalyzer: file threat assessment
│   ├── language.py         # LanguageAnalyzer: text quality and encoding
│   └── ioc_consolidator.py # IOC extraction and enrichment
├── api/
│   ├── ip_lookup.py        # IP geolocation (ip-api.com)
│   ├── gemini.py           # Google Gemini AI assessment
│   ├── urlscan.py          # urlscan.io domain reputation
│   ├── mxtoolbox.py        # SPF/DKIM/DMARC validation
│   ├── whois_client.py     # Domain WHOIS lookup
│   ├── virustotal.py       # VirusTotal IOC lookup
│   ├── abuseipdb.py        # AbuseIPDB IP reputation
│   └── alienvault.py       # AlienVault OTX intelligence
├── renderers/
│   ├── page.py             # Full HTML page assembly
│   └── widgets/            # 13 analysis section widgets
└── templates/
    ├── css/                # Dark theme, interactive styling
    └── js/                 # Section navigation, animations, interactivity
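The first stage of the pipeline, turning a raw .eml file into structured headers, can be sketched with Python's `email` package. This is an illustration of that step under stated assumptions, not the contents of parser.py; `parse_eml` and the keys of the returned dict are hypothetical.

```python
from email import policy
from email.parser import BytesParser
from email.utils import parseaddr


def parse_eml(raw_bytes):
    """Extract the headers most analyzers need from raw .eml bytes."""
    msg = BytesParser(policy=policy.default).parsebytes(raw_bytes)
    display_name, from_addr = parseaddr(str(msg.get("From", "")))
    return {
        "subject": str(msg.get("Subject", "")),
        "from_name": display_name,
        "from_addr": from_addr,
        "reply_to": parseaddr(str(msg.get("Reply-To", "")))[1],
        "return_path": parseaddr(str(msg.get("Return-Path", "")))[1],
        # Each relay prepends its Received header, so reverse to get send order.
        "hops": [str(h) for h in reversed(msg.get_all("Received", []))],
    }
```

Comparing `from_addr`, `reply_to`, and `return_path` is the raw material for sender spoofing checks, and `hops` is the input for delivery path tracing.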

📤 Output

🖼️ PNG mode (default) produces a single infographic image containing all analysis sections: threat gauge, sender analysis, authentication status, link flags, urgency patterns, attachments, domain age, delivery path, IP geolocation, and the rendered email body in a sandboxed frame.

📊 HTML mode (--html) additionally produces an interactive report with collapsible sections, scroll-spy navigation, animated gauges, copy-to-clipboard for IOCs, and download/print buttons.

🌐 Web UI (--web) serves a browser-based interface for uploading .eml files and viewing analysis results interactively without needing Playwright installed on the client.

🧪 Sample Emails

The samples/ directory contains example phishing emails (419 scams, social engineering, impersonation) for testing. Pre-generated reports are available in examples/.

pefa samples/

📄 License

See pyproject.toml for package metadata.
