adityashriwas/AegisGuard-AI

AegisGuard-AI

Transformer-based Web Application Firewall (WAF) Pipeline

🎯 Project Overview

Production-ready WAF system that uses a Transformer model to detect anomalous web traffic. The model learns benign request patterns from the DVWA, OWASP Juice Shop, and WebGoat applications and flags requests that deviate from them.

SIH 2025 Problem Statement: PS-25172 (Department of Space, SAC)


📁 Project Structure

waf-pipeline/
├── ingest/              # Log ingestion (Filebeat + Kafka consumer)
├── parser/              # Log parsing & normalization service
├── trainer/             # ML model training (Transformer + LoRA)
├── inference/           # FastAPI inference service
├── gateway/             # Node.js detection controller
├── dashboard/           # MERN dashboard (React + Node API)
├── blockchain/          # Solidity smart contract & gateway
├── infra/               # Docker Compose + Kubernetes manifests
├── demo/                # Demo scripts & attack generation
├── ci/                  # GitHub Actions workflows
└── docs/                # Architecture diagrams & API specs

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • Node.js 18+
  • Python 3.10+
  • CUDA-capable GPU (optional, for training)

Local POC Setup

  1. Start all services:

    cd infra
    docker-compose up -d
  2. Generate benign training data:

    python demo/generate_benign.py --app dvwa --count 100000 --seed 42
  3. Train model:

    docker exec -it waf-trainer python train.py --config configs/transformer_ae.yml
  4. Access dashboard:

    http://localhost:3000
    
  5. Test with attack injection:

    bash demo/attack_inject.sh

🏗️ Architecture

Data Flow

Web Traffic → Nginx (OpenResty) → Mirror Request → Gateway
                ↓                                      ↓
            Upstream App                    Inference Service
                ↓                                      ↓
            Access Logs → Filebeat → Kafka → Parser → MinIO/S3
                                                       ↓
                                                   Trainer

Components

1. Ingest Service

  • Filebeat config for log collection
  • Kafka consumer for real-time streaming
  • Batch upload endpoint

2. Parser Service

  • Normalizes Apache/Nginx logs
  • Masks sensitive data (IDs, emails, tokens)
  • Path templating & tokenization
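
The masking and path-templating steps can be sketched roughly as follows. This is a minimal illustration, not the parser's actual code: the regular expressions, the placeholder names (`<EMAIL>`, `<UUID>`, `<TOKEN>`, `<ID>`), and the `mask_value`, `template_path`, and `normalize_request` helpers are all assumptions made for the example.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical patterns; the real parser's rules are not shown in this README.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)
TOKEN_RE = re.compile(r"\b[A-Za-z0-9_-]{24,}\b")  # long opaque strings
NUMBER_RE = re.compile(r"\b\d+\b")


def mask_value(value: str) -> str:
    """Replace sensitive-looking substrings with fixed placeholders.

    Order matters: UUIDs are masked before the generic long-token rule,
    which would otherwise also match them.
    """
    value = EMAIL_RE.sub("<EMAIL>", value)
    value = UUID_RE.sub("<UUID>", value)
    value = TOKEN_RE.sub("<TOKEN>", value)
    return NUMBER_RE.sub("<ID>", value)


def template_path(path: str) -> str:
    """Collapse variable path segments so /users/1 and /users/2 look alike."""
    segments = []
    for segment in path.split("/"):
        if segment.isdigit():
            segments.append("<ID>")
        elif UUID_RE.fullmatch(segment):
            segments.append("<UUID>")
        else:
            segments.append(segment)
    return "/".join(segments)


def normalize_request(method: str, url: str) -> list:
    """Turn one request line into a list of tokens for the model."""
    parts = urlsplit(url)
    tokens = [method.upper(), template_path(parts.path)]
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        tokens.append(f"{key}={mask_value(value)}")
    return tokens
```

For example, `normalize_request("get", "/users/42/orders?page=3")` yields a token list in which both the user ID and the page number are replaced by `<ID>`, so requests that differ only in those values map to the same sequence.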

3. Trainer Service

  • Transformer Autoencoder (6 layers, 512 hidden)
  • LoRA fine-tuning for incremental learning
  • Export to TorchScript & ONNX

4. Inference Service

  • FastAPI service with <30 ms p50 latency
  • Anomaly scoring endpoint
  • Model versioning & hot reload

5. Gateway Service

  • Async request observation
  • Redis blocklist management
  • MongoDB alert storage

6. Dashboard

  • Real-time detection visualization
  • Threshold controls
  • Metrics & alerts

7. Blockchain Module

  • Polygon Mumbai smart contract
  • Tamper-proof detection records
  • IPFS metadata storage

🔧 Configuration

Environment Variables

Create a .env file:

# Kafka
KAFKA_BOOTSTRAP_SERVERS=localhost:9092

# MongoDB
MONGO_URI=mongodb://localhost:27017/waf

# Redis
REDIS_URL=redis://localhost:6379

# Model
MODEL_VERSION=v1.0.0
MODEL_PATH=/models/transformer_ae.pt
ANOMALY_THRESHOLD=3.0

# Blockchain
POLYGON_RPC_URL=https://rpc-mumbai.maticvigil.com
PRIVATE_KEY=your_private_key_here

# Security
JWT_SECRET=your_jwt_secret_here
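
A Python service could read these variables along the following lines. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical names, the sample values above are reused as defaults purely for illustration, and the variables are assumed to already be exported into the process environment (the code does not parse the `.env` file itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Placeholder values copied from the sample .env; they must be replaced.
PLACEHOLDERS = {"your_private_key_here", "your_jwt_secret_here"}


@dataclass(frozen=True)
class Settings:
    kafka_bootstrap_servers: str
    mongo_uri: str
    redis_url: str
    model_version: str
    model_path: str
    anomaly_threshold: float
    polygon_rpc_url: str
    private_key: str
    jwt_secret: str


def _require_secret(env: Mapping[str, str], name: str) -> str:
    """Return a secret, refusing empty values and the sample placeholders."""
    value = env.get(name, "")
    if not value or value in PLACEHOLDERS:
        raise ValueError(f"{name} must be set to a real value")
    return value


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        kafka_bootstrap_servers=env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
        mongo_uri=env.get("MONGO_URI", "mongodb://localhost:27017/waf"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        model_version=env.get("MODEL_VERSION", "v1.0.0"),
        model_path=env.get("MODEL_PATH", "/models/transformer_ae.pt"),
        anomaly_threshold=float(env.get("ANOMALY_THRESHOLD", "3.0")),
        polygon_rpc_url=env.get("POLYGON_RPC_URL", "https://rpc-mumbai.maticvigil.com"),
        private_key=_require_secret(env, "PRIVATE_KEY"),
        jwt_secret=_require_secret(env, "JWT_SECRET"),
    )
```

Failing fast on the placeholder secrets is a design choice for the sketch: it keeps a copied sample file from silently running with `your_jwt_secret_here` as the signing key.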

📊 Model Details

Transformer Autoencoder

  • Architecture: 6-layer encoder/decoder
  • Hidden dim: 512
  • Attention heads: 8
  • Max sequence length: 256 tokens
  • Training data: 1M benign requests

Anomaly Detection

  • Score: Reconstruction loss + token-level anomalies
  • Warning threshold: μ + 3σ
  • Block threshold: μ + 6σ
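
One way to derive the warning and block thresholds above is sketched here. It assumes the mean and standard deviation are computed over anomaly scores measured on held-out benign traffic, which this README does not state explicitly. The function names `fit_thresholds` and `classify`, and the choice of `>=` at the boundaries, are illustrative.

```python
import statistics
from typing import Sequence, Tuple


def fit_thresholds(
    benign_scores: Sequence[float],
    warn_k: float = 3.0,
    block_k: float = 6.0,
) -> Tuple[float, float]:
    """Return (warn, block) thresholds as mean + k * standard deviation."""
    mu = statistics.fmean(benign_scores)
    sigma = statistics.pstdev(benign_scores)  # population standard deviation
    return mu + warn_k * sigma, mu + block_k * sigma


def classify(score: float, warn: float, block: float) -> str:
    """Map an anomaly score to an action using the two thresholds."""
    if score >= block:
        return "block"
    if score >= warn:
        return "warn"
    return "allow"
```

A typical use would be to call `fit_thresholds` once per model version on a validation set of benign scores, store the two numbers alongside the model, and call `classify` on each score the inference service returns.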

🧪 Testing

Run tests

# Unit tests
pytest parser/tests/
pytest inference/tests/

# Integration tests
pytest tests/e2e_test.py

# Load tests
locust -f tests/locustfile.py

📈 Monitoring

  • Prometheus metrics: :9090/metrics
  • Grafana dashboard: :3001
  • Inference health: :8000/infer/health
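
A quick way to confirm the endpoints above respond after startup is a small polling script like the one below. It is a sketch only: the `localhost` host names are an assumption (the list gives ports and paths but no host), and `is_up` and `check_all` are hypothetical helper names. The `opener` parameter exists so the check can be exercised without a running stack.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Ports and paths come from the list above; the host is assumed.
ENDPOINTS = {
    "prometheus": "http://localhost:9090/metrics",
    "grafana": "http://localhost:3001",
    "inference": "http://localhost:8000/infer/health",
}


def is_up(url: str, timeout: float = 2.0,
          opener: Callable = urllib.request.urlopen) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def check_all(endpoints: Dict[str, str] = ENDPOINTS,
              opener: Callable = urllib.request.urlopen) -> Dict[str, bool]:
    """Check every endpoint and return a name -> reachable mapping."""
    return {name: is_up(url, opener=opener) for name, url in endpoints.items()}
```

Calling `check_all()` after `docker-compose up -d` returns a dictionary such as `{"prometheus": ..., "grafana": ..., "inference": ...}` with one boolean per service.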

🔐 Security

  • TLS for all inter-service communication
  • JWT authentication
  • Rate limiting per IP
  • Secrets via environment variables

📝 API Documentation

  • Inference API: http://localhost:8000/docs (OpenAPI)
  • Gateway API: http://localhost:4000/api-docs

🀝 Contributing

  1. Follow existing code structure
  2. Add tests for new features
  3. Update documentation
  4. Run linters before committing

📄 License

MIT License - See LICENSE file for details


👥 Team

SIH 2025 - Department of Space, SAC

shubham
