AegisGuard-AI

Transformer-based Web Application Firewall (WAF) Pipeline

🎯 Project Overview

A production-ready WAF system that uses Transformer models to detect anomalous web traffic. The models learn benign request patterns from the DVWA, OWASP Juice Shop, and WebGoat applications and flag traffic that deviates from them.

SIH 2025 Problem Statement: PS-25172 (Department of Space, SAC)

📁 Project Structure

```
waf-pipeline/
├── ingest/              # Log ingestion (Filebeat + Kafka consumer)
├── parser/              # Log parsing & normalization service
├── trainer/             # ML model training (Transformer + LoRA)
├── inference/           # FastAPI inference service
├── gateway/             # Node.js detection controller
├── dashboard/           # MERN dashboard (React + Node API)
├── blockchain/          # Solidity smart contract & gateway
├── infra/               # Docker Compose + Kubernetes manifests
├── demo/                # Demo scripts & attack generation
├── ci/                  # GitHub Actions workflows
└── docs/                # Architecture diagrams & API specs
```

🚀 Quick Start

Prerequisites

Docker & Docker Compose
Node.js 18+
Python 3.10+
CUDA-capable GPU (optional, for training)

Local POC Setup

Start all services:
```
cd infra
docker-compose up -d
```

Generate benign training data:
```
python demo/generate_benign.py --app dvwa --count 100000 --seed 42
```

Train model:
```
docker exec -it waf-trainer python train.py --config configs/transformer_ae.yml
```

Access dashboard:
```
http://localhost:3000
```

Test with attack injection:
```
bash demo/attack_inject.sh
```

🏗️ Architecture

Data Flow

```
Web Traffic → Nginx (OpenResty) → Mirror Request → Gateway
                ↓                                      ↓
            Upstream App                    Inference Service
                ↓                                      ↓
            Access Logs → Filebeat → Kafka → Parser → MinIO/S3
                                                       ↓
                                                   Trainer
```

Components

1. Ingest Service

Filebeat config for log collection
Kafka consumer for real-time streaming
Batch upload endpoint

2. Parser Service

Normalizes Apache/Nginx logs
Masks sensitive data (IDs, emails, tokens)
Path templating & tokenization
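
The masking and path-templating steps can be illustrated with a minimal sketch. The regular expressions, the placeholder names (<ID>, <EMAIL>, <TOKEN>), and the function names below are assumptions made for illustration; the parser's actual rules may differ.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical masking rules; the real parser may use different patterns.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ID_RE = re.compile(r"\d+")                    # purely numeric values
TOKEN_RE = re.compile(r"[A-Za-z0-9_\-]{24,}")  # long opaque strings


def mask_value(value: str) -> str:
    """Replace a sensitive-looking value with a placeholder."""
    if EMAIL_RE.fullmatch(value):
        return "<EMAIL>"
    if ID_RE.fullmatch(value):
        return "<ID>"
    if TOKEN_RE.fullmatch(value):
        return "<TOKEN>"
    return value


def normalize_request(method: str, url: str) -> list[str]:
    """Turn a request line into a template path plus masked query tokens."""
    parts = urlsplit(url)
    # Path templating: mask each path segment independently.
    segments = [mask_value(s) for s in parts.path.split("/") if s]
    tokens = [method.upper(), "/" + "/".join(segments)]
    # Query parameters keep their keys; only the values are masked.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        tokens.append(f"{key}={mask_value(value)}")
    return tokens
```

With these rules, a request such as GET /users/123/orders?email=a%40b.com collapses to the template /users/<ID>/orders with email=<EMAIL>, so the model sees request shape rather than user-specific values.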

3. Trainer Service

Transformer Autoencoder (6 layers, 512 hidden)
LoRA fine-tuning for incremental learning
Export to TorchScript & ONNX

4. Inference Service

FastAPI with <30ms p50 latency
Anomaly scoring endpoint
Model versioning & hot reload

5. Gateway Service

Async request observation
Redis blocklist management
MongoDB alert storage
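
The blocklist behaviour can be sketched without a running Redis instance. The class below is an in-memory stand-in, not the gateway's code: the class name, method names, and TTL semantics are assumptions. A Redis-backed version would typically store one key per IP with an expiry instead of a Python dict.

```python
import time


class InMemoryBlocklist:
    """Stand-in for a Redis-backed blocklist; entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}    # ip -> time at which the block expires
        self._clock = clock  # injectable so tests can control time

    def block(self, ip: str, ttl_seconds: float) -> None:
        """Block an IP for ttl_seconds from now."""
        self._expiry[ip] = self._clock() + ttl_seconds

    def is_blocked(self, ip: str) -> bool:
        """Return True while the IP's block has not yet expired."""
        expires_at = self._expiry.get(ip)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            # Lazily drop expired entries on lookup.
            del self._expiry[ip]
            return False
        return True
```

Expiring entries keeps a false positive from blocking a client indefinitely, which matters when the block decision comes from a statistical score.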

6. Dashboard

Real-time detection visualization
Threshold controls
Metrics & alerts

7. Blockchain Module

Smart contract on the Polygon Mumbai testnet
Tamper-proof detection records
IPFS metadata storage

🔧 Configuration

Environment Variables

Create a .env file:

```
# Kafka
KAFKA_BOOTSTRAP_SERVERS=localhost:9092

# MongoDB
MONGO_URI=mongodb://localhost:27017/waf

# Redis
REDIS_URL=redis://localhost:6379

# Model
MODEL_VERSION=v1.0.0
MODEL_PATH=/models/transformer_ae.pt
ANOMALY_THRESHOLD=3.0

# Blockchain
POLYGON_RPC_URL=https://rpc-mumbai.maticvigil.com
PRIVATE_KEY=your_private_key_here

# Security
JWT_SECRET=your_jwt_secret_here
```
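
A Python service could read these variables roughly as follows. This is a sketch under assumptions: the Settings class and load_settings function are hypothetical names, the defaults simply mirror the sample values above, and treating PRIVATE_KEY and JWT_SECRET as required is a design choice made here rather than documented project behaviour.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    kafka_bootstrap_servers: str
    mongo_uri: str
    redis_url: str
    model_version: str
    model_path: str
    anomaly_threshold: float
    polygon_rpc_url: str
    private_key: str
    jwt_secret: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    # Secrets get no default: fail fast instead of running with a placeholder.
    missing = [k for k in ("PRIVATE_KEY", "JWT_SECRET") if not env.get(k)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        kafka_bootstrap_servers=env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
        mongo_uri=env.get("MONGO_URI", "mongodb://localhost:27017/waf"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        model_version=env.get("MODEL_VERSION", "v1.0.0"),
        model_path=env.get("MODEL_PATH", "/models/transformer_ae.pt"),
        anomaly_threshold=float(env.get("ANOMALY_THRESHOLD", "3.0")),
        polygon_rpc_url=env.get("POLYGON_RPC_URL", "https://rpc-mumbai.maticvigil.com"),
        private_key=env["PRIVATE_KEY"],
        jwt_secret=env["JWT_SECRET"],
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in unit tests.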

📊 Model Details

Transformer Autoencoder

Architecture: 6-layer encoder/decoder
Hidden dim: 512
Attention heads: 8
Max sequence length: 256 tokens
Training data: 1M benign requests

Anomaly Detection

Score: Reconstruction loss + token-level anomalies
Warning threshold: μ + 3σ
Block threshold: μ + 6σ
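
The two thresholds can be derived from scores on held-out benign traffic. The sketch below assumes μ and σ are the mean and population standard deviation of those scores; the function names and the choice of population rather than sample deviation are assumptions, not the trainer's confirmed behaviour.

```python
from statistics import mean, pstdev


def fit_thresholds(benign_scores, warn_k=3.0, block_k=6.0):
    """Return (warn, block) thresholds as mu + k * sigma over benign scores."""
    mu = mean(benign_scores)
    sigma = pstdev(benign_scores)  # population standard deviation (assumed)
    return mu + warn_k * sigma, mu + block_k * sigma


def classify(score, warn, block):
    """Map an anomaly score to an action using the two thresholds."""
    if score >= block:
        return "block"
    if score >= warn:
        return "warn"
    return "allow"
```

For example, benign scores with μ = 2.0 and σ ≈ 0.63 give a warning threshold near 3.9 and a block threshold near 5.8, so a request scoring 4.0 raises a warning but is not blocked.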

🧪 Testing

Run tests

```
# Unit tests
pytest parser/tests/
pytest inference/tests/

# Integration tests
pytest tests/e2e_test.py

# Load tests
locust -f tests/locustfile.py
```

📈 Monitoring

Prometheus metrics: :9090/metrics
Grafana dashboard: :3001
Inference health: :8000/infer/health

🔐 Security

TLS for all inter-service communication
JWT authentication
Rate limiting per IP
Secrets via environment variables
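
Per-IP rate limiting can take several forms; one common option is a sliding window. The sketch below is illustrative only: the class name, the limits, and the in-process storage are assumptions, and the project may implement rate limiting elsewhere (for example at the Nginx layer).

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock                # injectable for testing
        self._hits = defaultdict(deque)    # ip -> timestamps of accepted requests

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Each IP has its own window, so one noisy client does not consume another client's budget.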

📝 API Documentation

Inference API: http://localhost:8000/docs (OpenAPI)
Gateway API: http://localhost:4000/api-docs

🤝 Contributing

Follow existing code structure
Add tests for new features
Update documentation
Run linters before committing

📄 License

MIT License - See LICENSE file for details

👥 Team

SIH 2025 - Department of Space, SAC

shubham
