A high-performance, distributed API Gateway with resilient rate limiting, built with Java 21 and Netty. This project is designed to protect internal services from abuse and traffic spikes, ensuring P99 latency ≤ 5ms and 99.99% availability through a "Fail-Closed" architecture.
The project implements a high-throughput, non-blocking gateway using Netty, with a hybrid rate-limiting strategy backed by Redis and local caching.
- Application: A high-performance HTTP server built with Java 21 and Netty.
- Rate Limiting: Token Bucket algorithm implemented via Lua scripts in Redis for atomicity.
- State Management: Hybrid approach using Redis (Global State) and Local Cache (Hot Path).
- Containerization: Multi-stage Docker builds for lean, production-ready images.
- Orchestration: k3d (Kubernetes) for local cluster orchestration.
- Ingress: NGINX Ingress Controller manages external access.
- Observability: Micrometer exposes Prometheus metrics for latency, decisions, and errors.
What: When the system cannot safely determine if a request should be allowed (e.g., Redis timeout or error), the request is denied.
Why:
- The gateway is a protection mechanism, not just a delivery mechanism.
- Allowing abusive traffic during a failure could cascade to internal services.
- Blocking legitimate traffic is a local, recoverable issue; crashing downstream services is a systemic failure.
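The fail-closed rule can be sketched as a small decision helper. This is an illustrative sketch, not the project's actual API: the class and method names are hypothetical, but the behavior matches the rule above — any Redis timeout or error resolves to a denial.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the fail-closed decision: any Redis error or
// timeout maps to DENY rather than letting traffic through unchecked.
class FailClosedDecider {
    enum Decision { ALLOW, DENY }

    /** Resolve the remote token check, denying on timeout or failure. */
    static Decision decide(CompletableFuture<Boolean> redisCheck, long timeoutMs) {
        try {
            return redisCheck.get(timeoutMs, TimeUnit.MILLISECONDS)
                    ? Decision.ALLOW : Decision.DENY;
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Fail-closed: when the limiter's state is unknown, protect downstream.
            return Decision.DENY;
        }
    }
}
```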
What: We use a two-tier state strategy to minimize latency.
How:
- Hot Path (Local Cache): The gateway first checks a local ConcurrentHashMap. If tokens are available locally, they are consumed immediately. This path involves zero network calls, ensuring sub-millisecond latency.
- Cold Path (Redis Sync): If the local bucket is empty or missing, the gateway calls Redis. A Lua script atomically recalculates tokens and syncs the state back to the local instance.
Why:
- Performance: Redis cannot be in the critical path of every request if we want P99 ≤ 5ms.
- Scalability: Reduces load on Redis by handling the majority of traffic locally.
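A minimal sketch of the two-tier lookup described above, assuming a hypothetical `redisSync` callback that stands in for the Lua-script call and returns the number of tokens granted to this instance (all names here are illustrative, not the project's real classes):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical two-tier lookup: consume from the local bucket when possible,
// fall back to a (stubbed) Redis sync only when local state is exhausted.
class HybridLimiter {
    private final Map<String, LocalBucket> localBuckets = new ConcurrentHashMap<>();
    private final Function<String, Long> redisSync; // stands in for the atomic Lua script

    HybridLimiter(Function<String, Long> redisSync) {
        this.redisSync = redisSync;
    }

    boolean tryAcquire(String clientId) {
        LocalBucket bucket = localBuckets.computeIfAbsent(clientId, id -> new LocalBucket());
        if (bucket.tryConsume()) {
            return true;                          // hot path: zero network calls
        }
        long granted = redisSync.apply(clientId); // cold path: sync with global state
        if (granted <= 0) {
            return false;
        }
        bucket.refill(granted - 1);               // cache the remainder locally, consume one now
        return true;
    }

    static final class LocalBucket {
        private final AtomicLong tokens = new AtomicLong(0);

        boolean tryConsume() {
            long t;
            do {
                t = tokens.get();
                if (t <= 0) return false;
            } while (!tokens.compareAndSet(t, t - 1));
            return true;
        }

        void refill(long n) {
            tokens.addAndGet(n);
        }
    }
}
```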
What:
- Local: A fine-grained StampedLock per bucket handles concurrent threads within the same instance.
- Distributed: Redis Lua scripts ensure atomicity across multiple gateway instances.
Why:
- Prevents race conditions where multiple requests could consume the same token.
- Ensures the global rate limit is respected (within a 1-2% acceptable drift margin).
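The local side of this locking scheme might look like the following token bucket guarded by a `StampedLock`. This is a simplified sketch; the project's actual `LocalTokenBucket` may differ, and the constructor/method names are illustrative.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical per-bucket refill/consume guarded by a StampedLock, so that
// concurrent threads in one instance never consume the same token twice.
class StampedTokenBucket {
    private final StampedLock lock = new StampedLock();
    private final double capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefillNanos;

    StampedTokenBucket(double capacity, double refillPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.tokens = capacity;                 // buckets start full (burst capacity)
        this.lastRefillNanos = nowNanos;
    }

    /** Refill based on elapsed time, then consume one token if available. */
    boolean tryConsume(long nowNanos) {
        long stamp = lock.writeLock();
        try {
            tokens = Math.min(capacity, tokens + (nowNanos - lastRefillNanos) * refillPerNano);
            lastRefillNanos = nowNanos;
            if (tokens < 1.0) {
                return false;                   // empty until time refills it
            }
            tokens -= 1.0;
            return true;
        } finally {
            lock.unlockWrite(stamp);
        }
    }
}
```

A write lock is taken on every operation here for simplicity; `StampedLock` also supports optimistic reads, which a production hot path could use to peek at the token count without contention.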
- Consistency vs. Latency: We accept a small margin of error (1-2% overshoot) in exchange for extreme speed. The local cache might slightly lag behind the global state.
- Fail-Closed Impact: In the event of a total Redis failure, traffic will be blocked once local tokens run out. This is a conscious decision to prioritize system stability over availability during outages.
- Java GC: Using Java implies managing Garbage Collection. We mitigate this by minimizing allocations in the hot path, but GC pauses are still a factor.
- Adaptive Rate Limiting: Adjust limits dynamically based on downstream health.
- Sharding: Shard Redis to handle even higher throughput.
- Billing Integration: Connect usage metrics to a billing system.
```text
.
├── src/
│   ├── main/
│   │   ├── java/com/sentinel/
│   │   │   ├── server/          # Netty Server & Handlers (HTTP, RateLimit, Metrics)
│   │   │   ├── ratelimit/       # Core Logic (Service, LocalTokenBucket)
│   │   │   └── infrastructure/  # Redis Manager & Configuration
│   │   └── resources/
│   │       └── scripts/         # Lua scripts for Redis
├── k6/                          # Load testing scripts
│   └── k6-load-test.js
├── k8s/                         # Kubernetes manifests
│   ├── deployment.yaml
│   ├── service.yaml
│   └── ingress.yaml
├── ci.Dockerfile                # Build environment
├── Dockerfile                   # Runtime image
├── build.gradle                 # Project dependencies
├── k3d-config.yaml              # Local cluster config
└── Makefile                     # Automation scripts
```
Ensure you have the following tools installed on your system:
The application can be configured via environment variables.
| Variable | Description | Default |
|---|---|---|
| `REDIS_URI` | Connection string for Redis. | `redis://localhost:6379` |
| `PORT` | HTTP server port. | `8081` |
| `REDIS_TIMEOUT_MS` | Timeout (ms) for Redis operations. Set to `50` for local/Docker environments, `2` for production. | `2` |
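These variables could be read with simple fallbacks to the documented defaults, as in this sketch (the `GatewayConfig` name and constructor are hypothetical, not the project's actual configuration class):

```java
import java.util.Map;

// Hypothetical config loader mirroring the table above: each setting falls
// back to its documented default when the environment variable is unset.
class GatewayConfig {
    final String redisUri;
    final int port;
    final int redisTimeoutMs;

    // Takes the environment as a Map (e.g. System.getenv()) for testability.
    GatewayConfig(Map<String, String> env) {
        this.redisUri = env.getOrDefault("REDIS_URI", "redis://localhost:6379");
        this.port = Integer.parseInt(env.getOrDefault("PORT", "8081"));
        this.redisTimeoutMs = Integer.parseInt(env.getOrDefault("REDIS_TIMEOUT_MS", "2"));
    }
}
```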
A Makefile is provided to automate the entire lifecycle.
This command creates the cluster, builds the image, installs ingress, and deploys the app.
```bash
make run
```

The gateway will be accessible at http://localhost:8085.
We use k6 to simulate traffic for Free, Pro, and Enterprise plans.
```bash
# Run the load test container
docker run --rm -i --add-host=host.docker.internal:host-gateway grafana/k6 run - < k6/k6-load-test.js
```

Watch the rate limiting metrics update in real time:
```bash
watch -n 1 "curl -s http://localhost:8085/metrics | grep gateway_"
```

To delete the cluster and remove resources:
```bash
make clean
```

The main endpoint protected by the rate limiter.
- Endpoint: `GET /api/resource`
- Headers: `X-API-Key` (Optional)
  - No key: Free Plan (10 req/s)
  - `pro_...`: Pro Plan (50 req/s)
  - `enterprise_...`: Enterprise Plan (500 req/s)
- Response:
  - `200 OK`: Request allowed.
  - `429 Too Many Requests`: Rate limit exceeded.
Prometheus metrics endpoint.
- Endpoint: `GET /metrics`
- Key Metrics:
  - `gateway_decision_local_total`: Decisions made by the local cache.
  - `gateway_decision_redis_total`: Decisions requiring a Redis sync.
  - `gateway_decision_denied_total`: Blocked requests.
  - `gateway_request_latency_seconds`: Latency distribution.
| Plan | Burst Capacity | Refill Rate | Identifier |
|---|---|---|---|
| Free | 20 tokens | 10 req/s | IP Address |
| Pro | 100 tokens | 50 req/s | API Key |
| Enterprise | 1000 tokens | 500 req/s | API Key (enterprise_*) |
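The plan table above maps to API keys roughly as follows. This is a hypothetical sketch: the `Plan` enum and `fromApiKey` method are illustrative, with the prefix matching inferred from the header conventions listed earlier.

```java
// Hypothetical mapping of the X-API-Key header to a plan, matching the
// burst capacities and refill rates in the pricing table.
enum Plan {
    FREE(20, 10), PRO(100, 50), ENTERPRISE(1000, 500);

    final int burstCapacity;
    final int refillPerSecond;

    Plan(int burstCapacity, int refillPerSecond) {
        this.burstCapacity = burstCapacity;
        this.refillPerSecond = refillPerSecond;
    }

    static Plan fromApiKey(String apiKey) {
        if (apiKey == null || apiKey.isBlank()) return FREE; // no key: limited by IP
        if (apiKey.startsWith("enterprise_")) return ENTERPRISE;
        if (apiKey.startsWith("pro_")) return PRO;
        return FREE; // unrecognized keys fall back to the Free plan
    }
}
```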