PII Guard is an LLM-powered tool that detects and manages Personally Identifiable Information (PII) in logs, designed to support data privacy and GDPR compliance.
⚠️ This is a personal side project, built to explore how Large Language Models can detect sensitive data in logs more intelligently than traditional regex-based approaches.
- About
- Why Use LLMs for PII Detection?
- PII Types Detected
- Architecture
- Getting Started
- Try It Out
- How to Test
- Project Structure
- Suggestions & Contributions
This project experiments with Large Language Models (LLMs), specifically the gemma:3b model running locally via Ollama, to evaluate how effectively they can identify PII in both structured and unstructured log data.
LLM-Based Detection with Ollama
- Uses gemma:3b through the Ollama runtime
- Analyzes logs using natural language understanding
- Handles real-world, messy logs better than regex
- Work in progress; contributions welcome!
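As a rough illustration of this approach, the sketch below sends one log line to a locally running Ollama instance and asks the model to list any PII it finds. It assumes Ollama's default address (http://localhost:11434) and its `/api/generate` endpoint. The prompt wording, the `buildDetectionRequest` and `detectPii` names, and the expected JSON reply shape are illustrative assumptions, not PII Guard's actual prompt or code.

```typescript
// Hypothetical sketch: not PII Guard's actual implementation.
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
  format: "json";
}

// Assumed default address of a local Ollama server.
const OLLAMA_URL = "http://localhost:11434/api/generate";

// Build the request body for a single log line (pure function, no I/O).
function buildDetectionRequest(logLine: string, model = "gemma:3b"): OllamaGenerateRequest {
  const prompt = [
    "You are a PII detector. List every piece of personal data in the log line below.",
    'Reply with JSON of the form {"findings":[{"type":"email","value":"..."}]}.',
    "Log line:",
    logLine,
  ].join("\n");
  return { model, prompt, stream: false, format: "json" };
}

// Send the request and parse the model's reply. Requires Node 18+ for global fetch.
async function detectPii(logLine: string): Promise<unknown> {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildDetectionRequest(logLine)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  // With stream=false, Ollama returns the generated text in the `response` field.
  const data = (await res.json()) as { response: string };
  return JSON.parse(data.response);
}
```

Keeping the request builder separate from the network call makes the prompt easy to inspect and test without a running model.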
- Identifies PII even when it's obfuscated, incomplete, or embedded in text
- Handles multilingual input and inconsistent formats
- Leverages semantic context instead of relying on static patterns
- Ideal for experimenting with privacy tooling powered by AI
Traditional detection rules often break on complex or irregular input, whereas LLMs can draw on surrounding context.
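To make the contrast concrete, here is a small sketch of the kind of static rule that breaks on messy input. The regex and the sample log lines are illustrative assumptions, not patterns used by PII Guard.

```typescript
// A typical static email pattern (illustrative, not exhaustive).
const EMAIL_REGEX = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return every substring of the line that the pattern recognizes as an email.
function findEmailsWithRegex(line: string): string[] {
  return line.match(EMAIL_REGEX) ?? [];
}

// Hypothetical log lines: the same address, once plain and once obfuscated.
const plain = "login ok for jane.doe@example.com";
const obfuscated = "login ok for jane dot doe at example dot com";

console.log(findEmailsWithRegex(plain));      // the address is matched
console.log(findEmailsWithRegex(obfuscated)); // nothing is matched
```

A reader who treats the second line as language can still tell it contains an email address, which is the gap this project explores with an LLM.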
full-name, first-name, last-name, username, email, phone-number, mobile, address, postal-code, location
racial-or-ethnic-origin, political-opinion, religious-belief, philosophical-belief, trade-union-membership, genetic-data, biometric-data, health-data, sex-life, sexual-orientation
national-id, passport-number, driving-license-number, ssn, vat-number, credit-card, iban, bank-account
ip-address, ip-addresses, mac-address, imei, device-id, device-metadata, browser-fingerprint, cookie-id, location-coordinates
license-plate
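The labels above can be treated as a closed vocabulary when validating model output. The sketch below shows one way to do that in TypeScript. The `PII_TYPES` constant, the `PiiType` type, and the `isPiiType` and `keepKnownLabels` helpers are hypothetical names, and collecting the labels into one flat array is an assumption about how they might be used.

```typescript
// Labels copied from the list above; helper names are hypothetical.
const PII_TYPES = [
  "full-name", "first-name", "last-name", "username", "email",
  "phone-number", "mobile", "address", "postal-code", "location",
  "racial-or-ethnic-origin", "political-opinion", "religious-belief",
  "philosophical-belief", "trade-union-membership", "genetic-data",
  "biometric-data", "health-data", "sex-life", "sexual-orientation",
  "national-id", "passport-number", "driving-license-number", "ssn",
  "vat-number", "credit-card", "iban", "bank-account",
  "ip-address", "ip-addresses", "mac-address", "imei", "device-id",
  "device-metadata", "browser-fingerprint", "cookie-id",
  "location-coordinates", "license-plate",
] as const;

type PiiType = (typeof PII_TYPES)[number];

// Type guard: true only for labels in the known vocabulary.
function isPiiType(label: string): label is PiiType {
  return (PII_TYPES as readonly string[]).includes(label);
}

// Drop any labels that are not part of the vocabulary,
// for example ones a model invented.
function keepKnownLabels(labels: string[]): PiiType[] {
  return labels.filter(isPiiType);
}
```

Filtering against a fixed list like this is one simple guard against a model returning labels outside the supported set.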
This is how PII Guard works:
- Clone the repo and start everything with a single command:
  make all-in-up
- Shut down everything with:
  make all-in-down
The make all-in-up command launches the full stack:
- PostgreSQL
- Elasticsearch
- RabbitMQ
- Ollama (with gemma:3b)
- PII Guard dashboard and backend API
Visit: http://localhost:3000
http://localhost:8888/api/jobs
curl --location 'http://localhost:8888/api/jobs/flush' \
--header 'Content-Type: application/json' \
--data-raw '{
  "version": "1.0.0",
  "logs": [
    "{\"timestamp\":\"2025-04-21T15:02:10Z\",\"service\":\"auth-service\",\"level\":\"INFO\",\"event\":\"user_login\",\"requestId\":\"1a9c7e21\",\"user\":{\"id\":\"u9001001\",\"name\":\"Leila Park\",\"email\":\"[email protected]\"},\"srcIp\":\"198.51.100.15\"}",
    "{\"timestamp\":\"2025-04-21T15:02:12Z\",\"service\":\"cache-service\",\"level\":\"DEBUG\",\"event\":\"cache_miss\",\"requestId\":\"82c5cc9f\",\"cacheKey\":\"product_44291_variant_blue\",\"region\":\"us-east-1\"}"
  ]
}'
Please refer to the Testing PII Guard guide for instructions on running the test setup, including simulated log generation and stress testing.
This guide will help you set up a test environment to evaluate the performance and detection accuracy of PII Guard.
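For scripted tests, the flush request from the curl example can also be built in code. This is a minimal sketch that assumes the same endpoint (http://localhost:8888/api/jobs/flush) and payload shape (`version` plus an array of log strings) as that curl call. The `buildFlushPayload` and `flushLogs` names are hypothetical, and because the response format is not documented here, the response is returned as raw text.

```typescript
// Hypothetical sketch mirroring the curl example's payload shape.
interface FlushPayload {
  version: string;
  logs: string[]; // each entry is one log line, serialized as a string
}

// Endpoint taken from the curl example.
const FLUSH_URL = "http://localhost:8888/api/jobs/flush";

// Serialize structured log objects into the string form the curl example uses.
function buildFlushPayload(entries: object[], version = "1.0.0"): FlushPayload {
  return { version, logs: entries.map((entry) => JSON.stringify(entry)) };
}

// POST the payload to the API. Requires Node 18+ for global fetch.
async function flushLogs(entries: object[]): Promise<string> {
  const res = await fetch(FLUSH_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildFlushPayload(entries)),
  });
  if (!res.ok) throw new Error(`flush failed with HTTP ${res.status}`);
  return res.text();
}
```

Building each log entry as an object and serializing it with `JSON.stringify` avoids the hand-escaped quotes that the curl version needs.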
- API: api/
- Dashboard: ui/
- LLM Prompt Template: api/src/prompt/pii.prompt.ts
Got a bug to report? Feature request? Wild idea? Bring it on!
- Bug reports help improve stability
- Feature requests help shape the product
- Suggestions, feedback, and contributions are all welcome!