High-performance API for checking if passwords have been compromised in data breaches, using Redis-backed Bloom filters.
- Space Efficient: ~1.7MB for 1M passwords at the default 0.1% false positive rate (scales to ~1GB for 600M passwords)
- Fast Lookups: O(k) time complexity, where k is the number of hash functions
- Low False Positive Rate: Configurable accuracy (default 0.1% false positive rate)
- Privacy Focused: Never stores actual passwords, only bits set at positions derived from cryptographic hashes
- Optimal Size Calculation: Uses mathematical formulas to determine the ideal bit array size and number of hash functions based on expected items and desired false positive rate.
- Double Hashing: Instead of k independent hash functions, uses 2 base hashes to generate k positions:
  - Hash count formula: k = (m/n) * ln(2), where m is the bit array size and n is the expected number of items (Reference: Wikipedia - Bloom Filter)
  - Position generation: g_i(x) = (h1(x) + i * h2(x)) mod m, for i = 0 .. k-1 (Reference: Less Hashing, Same Performance)
- Redis Persistence: Stores bit array in Redis using SETBIT and GETBIT operations for fast and persistent access.
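The sizing and double-hashing steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the bit-array formula m = -n * ln(p) / (ln 2)^2 is the standard one, and deriving both base hashes from a single SHA-256 digest is an assumption.

```python
import hashlib
import math
from typing import List, Tuple


def optimal_parameters(expected_items: int, false_positive_rate: float) -> Tuple[int, int]:
    """Return (m, k): bit-array size and hash count for n items at rate p."""
    n, p = expected_items, false_positive_rate
    # Standard sizing formula: m = -n * ln(p) / (ln 2)^2, rounded up to whole bits.
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # Hash count formula: k = (m / n) * ln(2).
    # Rounding to the nearest integer is an assumption; truncation is also common.
    k = max(1, round((m / n) * math.log(2)))
    return m, k


def bit_positions(item: str, m: int, k: int) -> List[int]:
    """Generate k positions via double hashing: (h1 + i * h2) mod m."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    # Assumption: both base hashes come from one SHA-256 digest.
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the stride is never zero
    return [(h1 + i * h2) % m for i in range(k)]


m, k = optimal_parameters(1_000_000, 0.001)
print(m)  # 14377588
print(bit_positions("password123", m, k)[:3])
```

For 1M items at a 0.1% false positive rate this gives about 14.4 million bits, which is where the ~1.7MB figure comes from.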
A Bloom filter is a probabilistic data structure whose membership check has two possible outcomes:
- Definitely not in set: If any of the item's k bits is 0, the item was never added
- Possibly in set: If all k bits are 1, the item might have been added
For password checking, this behavior is acceptable because:
- False positives are tolerable (better safe than sorry)
- False negatives never occur (a breached password that was added to the filter is never missed)
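The two outcomes can be seen in a small sketch. It assumes a bit store exposing `setbit(key, offset, value)` and `getbit(key, offset)`, which are the method names redis-py uses for SETBIT and GETBIT. The `InMemoryBits` stand-in and the `BloomFilter` class are hypothetical, written only so the example runs without a Redis server.

```python
import hashlib
from typing import Dict, List, Tuple


class InMemoryBits:
    """Stand-in exposing the same setbit/getbit calls as a redis-py client."""

    def __init__(self) -> None:
        self._bits: Dict[Tuple[str, int], int] = {}

    def setbit(self, key: str, offset: int, value: int) -> int:
        previous = self._bits.get((key, offset), 0)
        self._bits[(key, offset)] = value
        return previous  # Redis SETBIT also returns the previous bit value

    def getbit(self, key: str, offset: int) -> int:
        return self._bits.get((key, offset), 0)


class BloomFilter:
    def __init__(self, client, key: str, bit_size: int, num_hashes: int) -> None:
        self.client, self.key = client, key
        self.bit_size, self.num_hashes = bit_size, num_hashes

    def _positions(self, item: str) -> List[int]:
        # Assumption: double hashing over one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.bit_size for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.client.setbit(self.key, pos, 1)

    def might_contain(self, item: str) -> bool:
        # Any 0 bit proves the item was never added; all 1s means "possibly added".
        return all(self.client.getbit(self.key, pos) for pos in self._positions(item))


bloom = BloomFilter(InMemoryBits(), "breached", bit_size=14_377_588, num_hashes=7)
bloom.add("hunter2")
print(bloom.might_contain("hunter2"))  # True: an added item is never missed
print(bloom.might_contain("correct horse battery staple"))  # almost certainly False
```

Passing a `redis.Redis(...)` client in place of `InMemoryBits` should route the same calls to Redis. Batching them in a pipeline would likely reduce round trips.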
- Core Bloom filter implementation with optimal parameters
- FastAPI endpoints with JSON request/response models
- Docker containerization (Redis + API)
- Statistics endpoint for monitoring
- Automatic connection retry logic
- Environment-based configuration (development/production)
- Comprehensive test suite
- Production deployment configurations
POST /check
Content-Type: application/json
{
  "password": "string"
}
Response:
{
  "compromised": boolean,
  "message": "string"
}

POST /add
Content-Type: application/json
{
  "password": "string"
}
Response:
{
  "added": boolean,
  "message": "string"
}

GET /stats
Response:
{
  "bit_size": 14377588,
  "bits_set": 40,
  "num_hashes": 7,
  "expected_items": 1000000,
  "false_positive_rate": 0.001,
  "memory_usage_mb": 1.72
}

GET /health
Response:
{
  "status": "healthy",
  "message": "string"
}

- Python 3.8+
- Redis server
- Docker (optional)
- Clone the repository
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
  cd backend
  pip install -r requirements.txt
- Start Redis server
- Run the application:
  uvicorn app.main:app --reload
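Once the server is running, the POST /check endpoint documented earlier can be called from a small client. The sketch below uses only the standard library. The helper names are hypothetical, and the base URL assumes uvicorn's default bind address of http://127.0.0.1:8000.

```python
import json
import urllib.request

DEFAULT_BASE_URL = "http://127.0.0.1:8000"  # assumption: uvicorn's default host and port


def build_check_request(password: str, base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build a POST /check request carrying the documented JSON body."""
    body = json.dumps({"password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/check",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_password(password: str, base_url: str = DEFAULT_BASE_URL) -> bool:
    """Send the request and return the "compromised" field of the response."""
    with urllib.request.urlopen(build_check_request(password, base_url), timeout=5) as resp:
        return bool(json.load(resp)["compromised"])


# Requires the API to be running locally:
# print(check_password("password123"))
```

Building the request separately from sending it keeps the payload easy to inspect without a live server.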
Alternatively, run the API and Redis together with Docker:
docker-compose up --build
For a production deployment:
docker-compose -f deployment/docker-compose.prod.yml up -d

The application uses environment variables for configuration:
- ENVIRONMENT: Set to "production" for production deployment
- REDIS_HOST: Redis server hostname
- REDIS_PORT: Redis server port
- REDIS_PASSWORD: Redis authentication password
- BLOOM_EXPECTED_ITEMS: Expected number of items in the filter
- BLOOM_FALSE_POSITIVE_RATE: Desired false positive rate
See backend/env.example for a complete list of configuration options.
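As a rough illustration of how these variables might be read at startup, the sketch below loads them into a typed settings object. The `Settings` class, the `load_settings` helper, and every default value are assumptions for the example. The real defaults live in backend/env.example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    environment: str
    redis_host: str
    redis_port: int
    redis_password: Optional[str]
    bloom_expected_items: int
    bloom_false_positive_rate: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables; every default here is an assumption."""
    return Settings(
        environment=env.get("ENVIRONMENT", "development"),
        redis_host=env.get("REDIS_HOST", "localhost"),
        redis_port=int(env.get("REDIS_PORT", "6379")),  # 6379 is Redis's standard port
        redis_password=env.get("REDIS_PASSWORD") or None,  # empty string means no auth
        bloom_expected_items=int(env.get("BLOOM_EXPECTED_ITEMS", "1000000")),
        bloom_false_positive_rate=float(env.get("BLOOM_FALSE_POSITIVE_RATE", "0.001")),
    )


settings = load_settings({"ENVIRONMENT": "production", "REDIS_HOST": "redis"})
print(settings.redis_host, settings.redis_port)  # redis 6379
```

Accepting the environment as a mapping makes the loader easy to exercise in tests without touching the real process environment.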
Run the test suite:
cd backend
pytest tests/ -v