PhishingSpamDataSet

Phishing and Spam Email DataSet

A multi-layered open-science dataset for phishing, spam, and legitimate email analysis using emotional, motivational, and semantic labels.

Overview

This repository contains a new, richly annotated dataset designed for research on LLM-based email security, including phishing detection, spam analysis, emotional-manipulation analysis, and automated robustness evaluation under paraphrasing.

The dataset includes:

  • Human-written phishing, spam, and legitimate emails
  • LLM-generated emails (GPT-4o, DeepSeek-Chat, Grok, Llama 3.3, Gemini, Nova, Mistral, etc.)
  • Emotion and motivation labels
  • Rephrased/paraphrased variants from three independent LLM pipelines
  • Claude 3.5 Sonnet classifications

This repository enables reproducible research on how LLMs interpret, classify, and analyze deceptive online communication.

Contents

Dataset

merged_emails_with_categories.jsonl
Contains:

  • True category (Phishing, Spam, Valid)
  • Human vs. LLM-generated origin
  • Rephrasing source (GPT-4o, DeepSeek, RandomAPI, Manual)
  • Emotional labels (urgency, fear, authority, etc.)
  • Motivational labels (link-click, credential theft, etc.)
  • Claude 3.5 Sonnet predicted classification
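
A minimal sketch of loading and inspecting the file with the standard library is shown below; the JSON field names used here (`category`, `claude_prediction`) are assumptions based on the annotations listed above and should be checked against the actual file.

```python
import json
from collections import Counter

# Load the JSONL dataset; field names below are hypothetical and should be
# verified against merged_emails_with_categories.jsonl.
records = []
with open("merged_emails_with_categories.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if line:
            records.append(json.loads(line))

# Distribution of true categories (Phishing, Spam, Valid).
print(Counter(r.get("category") for r in records))

# Cross-tabulate true category against the Claude 3.5 Sonnet prediction.
pairs = Counter((r.get("category"), r.get("claude_prediction")) for r in records)
for (true_label, predicted), n in pairs.most_common():
    print(f"{true_label} -> {predicted}: {n}")
```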

Scripts

  • accuracy_validation.py — Benchmark emotional & motivational detection
  • category.py — Email classification pipeline
  • stats.py — Strict/relaxed accuracy, confusion matrices, paraphrase robustness

Results

  • Confusion matrices
  • Strict and relaxed classification reports
  • Emotional/motivational LLM benchmarking
  • Robustness metrics across rephrasing pipelines

Abstract

Phishing and spam emails remain pervasive cybersecurity threats, increasingly strengthened by the use of Large Language Models (LLMs) to generate deceptive content. This work introduces a comprehensive, multi-layered email dataset containing both human-written and LLM-generated messages across phishing, spam, and legitimate categories. Each email is enriched with emotional and motivational labels—capturing cues such as urgency, fear, authority, greed, and link-click incentives—along with paraphrased variants generated by multiple LLM pipelines to test classifier robustness.

We benchmark several modern LLMs for emotional and motivational detection and identify Claude 3.5 Sonnet as the most reliable model for large-scale annotation. We further evaluate its classification accuracy under both strict (three-class) and relaxed (unwanted vs. valid) settings across original and LLM-rewritten emails. Results show that contemporary LLMs can reliably detect harmful messages and emotional manipulation strategies, though distinguishing spam from legitimate emails remains difficult.

All templates, datasets, and source code are released openly to support reproducible research in AI-assisted email security.

Methodology

1. Dataset Construction

  • Human-written emails collected from open-source corpora and curated phishing repositories
  • LLM-generated emails created for diversity
  • Rephrasing via three pipelines:
    • DeepSeek-Chat
    • GPT-4o
    • OpenRouter multi-model pipeline (Gemini, Nova, Grok, Llama, Mistral, etc.)
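
As an illustration of the rephrasing step (not necessarily the exact code used by the repository's pipelines), the sketch below paraphrases one email body through the GPT-4o pipeline via the OpenAI Python SDK; the prompt wording, model name, and any batching or rate-limiting logic are assumptions.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def rephrase_email(body: str, model: str = "gpt-4o") -> str:
    """Return a paraphrased version of an email body, preserving its intent."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": (
                    "Rewrite the following email in different wording while "
                    "preserving its meaning, tone, and any calls to action."
                ),
            },
            {"role": "user", "content": body},
        ],
    )
    return response.choices[0].message.content


# Example usage with a hypothetical phishing-style body:
# print(rephrase_email("Your account has been locked. Verify it within 24 hours..."))
```

DeepSeek and OpenRouter expose OpenAI-compatible endpoints, so the same pattern applies to the other two pipelines with a different `base_url` and model identifier.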

2. Emotional & Motivational Labeling

Four LLMs evaluated:

  • GPT-4o-mini
  • GPT-4.1-mini
  • Claude 3.5 Sonnet
  • DeepSeek-Chat

Evaluation metrics:

  • Strict accuracy
  • Close-enough accuracy
  • Jaccard similarity
  • Internal consistency across 5 independent runs
  • Precision & recall

Claude 3.5 Sonnet was selected for full-dataset labeling because it showed the highest agreement with human annotations.
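
The label-set metrics above can be made concrete with a short sketch; the "close-enough" criterion used here (at least one shared label between the predicted and human label sets) is an assumption about how the relaxed match is defined, while strict accuracy requires exact set equality.

```python
def jaccard(pred: set[str], gold: set[str]) -> float:
    """Jaccard similarity between a predicted and a human label set."""
    if not pred and not gold:
        return 1.0
    return len(pred & gold) / len(pred | gold)


def evaluate(pairs: list[tuple[set[str], set[str]]]) -> dict[str, float]:
    """Strict accuracy, close-enough accuracy, and mean Jaccard over (pred, gold) pairs."""
    strict = sum(p == g for p, g in pairs)
    close = sum(bool(p & g) for p, g in pairs)  # assumed: any shared label counts
    mean_jac = sum(jaccard(p, g) for p, g in pairs) / len(pairs)
    return {
        "strict_accuracy": strict / len(pairs),
        "close_enough_accuracy": close / len(pairs),
        "mean_jaccard": mean_jac,
    }


# Example with emotional labels:
pairs = [({"urgency", "fear"}, {"urgency", "authority"}), ({"greed"}, {"greed"})]
print(evaluate(pairs))
```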

3. Email Classification

Claude 3.5 Sonnet performed final classification using:

  • Email body
  • Subject line
  • Sender metadata
  • URL and attachment indicators

Evaluated using:

  • Strict classification (Phishing / Spam / Valid)
  • Relaxed classification (Unwanted vs. Valid)
  • Robustness to paraphrasing across three pipelines
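
A minimal sketch of how the two evaluation settings relate: both Phishing and Spam collapse into a single "Unwanted" class in the relaxed setting, as described above; the helper names and example labels are illustrative only.

```python
def to_relaxed(label: str) -> str:
    """Collapse the three-class labels into the relaxed Unwanted/Valid scheme."""
    return "Valid" if label == "Valid" else "Unwanted"


def accuracy(preds: list[str], golds: list[str]) -> float:
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)


preds = ["Phishing", "Spam", "Valid", "Spam"]
golds = ["Phishing", "Valid", "Valid", "Phishing"]

strict = accuracy(preds, golds)
relaxed = accuracy([to_relaxed(p) for p in preds], [to_relaxed(g) for g in golds])
print(f"strict={strict:.2f}, relaxed={relaxed:.2f}")  # strict=0.50, relaxed=0.75
```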

Key Findings

Emotional & Motivational Analysis

  • Claude 3.5 Sonnet:
    • Jaccard similarity = 0.60
    • Close-enough accuracy = 42%
  • Motivational detection is harder, but top models achieve 53–61% close-enough accuracy
  • LLMs often infer additional plausible motivations beyond human annotations

Email Classification

Across all email groups (Original, DeepSeek-rephrased, GPT-4o-rephrased, RandomAPI):

  • Strict accuracy: ~66–67%
  • Relaxed accuracy: ~69–70%
  • Phishing detection excellent (F1 ≈ 0.93)
  • Spam detection weak (F1 ≈ 0.20–0.23)
  • Valid classification moderate (F1 ≈ 0.63)

Robustness to Paraphrasing

Maximum deviation from original:

  • Strict accuracy deviation: 0.55 percentage points
  • Relaxed accuracy deviation: 0.54 percentage points

Rephrasing has minimal impact on classifier performance.
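
The deviation figures above are the largest absolute differences, in percentage points, between each rephrased group's accuracy and the original group's accuracy; a minimal sketch of that computation (with placeholder values, not the actual results) is below.

```python
# Strict accuracy per group, in percent (placeholder values for illustration).
strict_accuracy = {
    "Original": 66.5,
    "DeepSeek-rephrased": 66.1,
    "GPT-4o-rephrased": 66.9,
    "RandomAPI": 66.0,
}

baseline = strict_accuracy["Original"]
max_deviation = max(
    abs(acc - baseline) for group, acc in strict_accuracy.items() if group != "Original"
)
print(f"Maximum strict-accuracy deviation: {max_deviation:.2f} percentage points")
```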

Reproducibility

Running stats.py produces:

  • Strict and relaxed accuracy
  • Confusion matrices
  • Group-by-group metrics
  • Paraphrasing robustness analysis
  • LaTeX-ready tables for publications
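
For readers reproducing similar reports outside of stats.py, a minimal sketch with scikit-learn (which may or may not be the library the script itself uses) covers the strict confusion matrix and per-class precision/recall/F1:

```python
from sklearn.metrics import classification_report, confusion_matrix

labels = ["Phishing", "Spam", "Valid"]

# Illustrative gold labels and predictions, not the dataset's actual results.
golds = ["Phishing", "Spam", "Valid", "Spam", "Phishing"]
preds = ["Phishing", "Valid", "Valid", "Spam", "Phishing"]

print(confusion_matrix(golds, preds, labels=labels))
print(classification_report(golds, preds, labels=labels, digits=2))
```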
