A production-ready FastAPI backend service for OCR processing, signature detection, PII identification, and image recognition with AI-powered validation.
- **OCR Processing** - Extract text from documents and PDFs using EasyOCR
- **Image Detection** - Detect and locate pictures/objects within documents using YOLO
- **Signature Detection** - Identify signatures using custom YOLO models
- **PII Detection** - Find and classify personally identifiable information
- **AI Validation** - LLM-powered validation using Google Gemini
- **Multi-language Support** - OCR in multiple languages
- **Privacy-First** - Configurable LLM usage (off by default)
- Python 3.8+
- Virtual environment (recommended)
- Google Gemini API key (optional, for LLM features)
- **Clone and set up**

  ```bash
  cd backend-app
  python -m venv venv
  # Windows
  .\venv\Scripts\activate
  # Linux/Mac
  source venv/bin/activate
  ```
- **Install dependencies**

  ```bash
  pip install -r requirements.txt
  python -m spacy download en_core_web_sm
  ```
- **Configure environment**

  ```bash
  cp .env.production .env
  # Edit .env with your settings
  ```
- **Start the server**

  ```bash
  python start_server.py
  ```
```bash
# Build and run
docker-compose up -d

# Check health
curl http://localhost:8000/health
```

- Interactive Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
- Process Document: `POST /process_document`

```bash
curl -X POST "http://localhost:8000/process_document" \
  -F "[email protected]" \
  -F "use_llm=false"
```

Example response:

```json
{
  "ocr": {
    "pages": [
      {
        "page_number": 1,
        "blocks": [
          {
            "text": "Sample text",
            "confidence": 0.95,
            "position": {
              "top_left": [100, 150],
              "top_right": [200, 150],
              "bottom_right": [200, 200],
              "bottom_left": [100, 200]
            }
          }
        ]
      }
    ]
  },
  "signatures": [
    {
      "bbox": {"x1": 100, "y1": 150, "x2": 200, "y2": 200},
      "confidence": 0.85
    }
  ],
  "pii_detection": [
    {
      "type": "PERSON",
      "value": "John Doe",
      "confidence": 0.9,
      "bbox": {"x1": 100, "y1": 150, "x2": 200, "y2": 200}
    }
  ],
  "detected_images": [
    {
      "page_number": 1,
      "images": [
        {
          "type": "person",
          "confidence": 0.85,
          "position": {
            "top_left": [100, 150],
            "top_right": [200, 150],
            "bottom_right": [200, 250],
            "bottom_left": [100, 250]
          }
        }
      ]
    }
  ]
}
```

Project structure:

```
backend-app/
├── api/
│   └── main.py                # FastAPI endpoints & routing
├── ocr/
│   ├── processor.py           # OCR + image detection
│   └── processor_stream.py    # Streaming OCR
├── pii_detection/
│   ├── detector.py            # PII entity detection
│   ├── models.py              # Data models
│   ├── bbox_mapper.py         # Coordinate mapping
│   ├── indian_recognizers.py  # India-specific PII
│   └── llm_validator.py       # AI validation
├── pipeline/
│   └── orchestrator.py        # Processing coordination
├── config/
│   ├── settings.py            # Configuration
│   └── logging.py             # Logging setup
├── models/                    # ML model storage
├── logs/                      # Application logs
└── tests/                     # Test files
```
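The `/process_document` endpoint can also be called from Python. Below is a minimal client sketch; the field names match the curl example above, but the helper names (`process_document`, `summarize`) and the timeout value are illustrative, not part of the service:

```python
API_URL = "http://localhost:8000/process_document"

def process_document(path: str, use_llm: bool = False) -> dict:
    """Upload a document and return the combined detection results."""
    import requests  # third-party: pip install requests
    with open(path, "rb") as fh:
        response = requests.post(
            API_URL,
            files={"file": fh},
            data={"use_llm": str(use_llm).lower()},
            timeout=120,  # OCR on large PDFs can take a while
        )
    response.raise_for_status()
    return response.json()

def summarize(result: dict) -> dict:
    """Count detections per category in the response format shown above."""
    return {
        "signatures": len(result.get("signatures", [])),
        "pii_entities": len(result.get("pii_detection", [])),
        "pages": len(result.get("ocr", {}).get("pages", [])),
    }
```

For example, `summarize(process_document("contract.pdf"))` would report how many signatures, PII entities, and OCR pages were found.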
```bash
# Required
GEMINI_API_KEY=your_gemini_api_key_here

# Optional
OCR_GPU_ENABLED=false
MAX_FILE_SIZE=10485760
DEBUG=false
LOG_LEVEL=INFO
```

- OCR: EasyOCR with CPU/GPU support
- Signature Detection: `detector_yolo_1cls.pt` (custom YOLO model)
- Image Detection: YOLOv8n (auto-downloaded)
- PII Detection: Presidio + custom recognizers
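The environment variables above are typically read in one place. A minimal sketch of how `config/settings.py` might load them; the variable names mirror the list above, but the `Settings` class and defaults are assumptions, not the service's actual code:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Typed view over the environment variables listed above."""
    gemini_api_key: str = ""
    ocr_gpu_enabled: bool = False
    max_file_size: int = 10485760  # 10 MiB
    debug: bool = False
    log_level: str = "INFO"

def load_settings() -> Settings:
    """Read settings from the environment, falling back to safe defaults."""
    env = os.environ
    return Settings(
        gemini_api_key=env.get("GEMINI_API_KEY", ""),
        ocr_gpu_enabled=env.get("OCR_GPU_ENABLED", "false").lower() == "true",
        max_file_size=int(env.get("MAX_FILE_SIZE", 10485760)),
        debug=env.get("DEBUG", "false").lower() == "true",
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```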
```bash
# Deploy to Railway
railway login
railway link
railway up
```

```bash
# Build and deploy to Google Cloud Run
gcloud builds submit --tag gcr.io/PROJECT_ID/pii-backend
gcloud run deploy --image gcr.io/PROJECT_ID/pii-backend --platform managed
```

```bash
# Deploy to Render using render.yaml configuration
# Push to GitHub and connect to Render
```

```bash
# Deploy to Heroku
heroku create your-app-name
git push heroku main
```

```bash
# Run the test suite
python -m pytest tests/
python test_image_detection.py
```

```bash
# Test OCR
curl -X POST "http://localhost:8000/process_document" \
  -F "file=@test_image.jpg"

# Test with LLM validation
curl -X POST "http://localhost:8000/process_document" \
  -F "file=@test_image.jpg" \
  -F "use_llm=true"
```

- LLM Usage: Disabled by default; requires explicit activation
- File Validation: Strict file type and size validation
- CORS Protection: Configurable CORS policies
- Input Sanitization: Comprehensive input validation
- Error Handling: Secure error messages (no sensitive data leakage)
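The file-validation and error-handling points above can be combined in one pre-processing check. A hedged sketch; the allowed types, size limit, and helper name are illustrative and may differ from the service's actual rules, but note how the error messages stay generic so nothing sensitive leaks to the client:

```python
MAX_FILE_SIZE = 10485760  # mirrors the MAX_FILE_SIZE setting above
ALLOWED_TYPES = {"application/pdf", "image/jpeg", "image/png"}

def validate_upload(filename: str, content_type: str, size: int) -> None:
    """Reject bad uploads before any processing starts.

    Raises ValueError with a deliberately generic message on failure.
    """
    if content_type not in ALLOWED_TYPES:
        raise ValueError("unsupported file type")
    if size > MAX_FILE_SIZE:
        raise ValueError("file too large")
    # Block path-traversal attempts in the supplied filename
    if not filename or ".." in filename or "/" in filename or "\\" in filename:
        raise ValueError("invalid filename")
```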
- Model Caching: Intelligent caching for YOLO and EasyOCR models
- Memory Management: Automatic garbage collection and cleanup
- Concurrent Processing: Async FastAPI for high throughput
- Resource Limits: Configurable memory and processing limits
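Model caching as described above can be done with a process-wide memoized loader, so YOLO and EasyOCR weights are read from disk only once per process. A minimal sketch; the loader name is illustrative and the dict stands in for the real model object:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_model(name: str):
    """Load a model once per process; later calls return the cached object.

    Loading YOLO/EasyOCR weights is expensive, so caching avoids paying
    that cost on every request. The dict here is a stand-in for e.g.
    YOLO(name) or easyocr.Reader([...]).
    """
    return {"name": name, "loaded": True}
```

Because `lru_cache` keys on the model name, `get_model("detector_yolo_1cls.pt")` returns the same object on every call after the first.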
- **Import Errors**

  ```bash
  # Set Python path
  export PYTHONPATH=.
  python pipeline/orchestrator.py
  ```

- **GPU Issues**

  ```bash
  # Disable GPU if issues occur
  export OCR_GPU_ENABLED=false
  ```

- **Memory Issues**

  ```bash
  # Reduce concurrent processing
  export MAX_WORKERS=1
  ```
```powershell
# Method 1: Set PYTHONPATH (Windows PowerShell)
$env:PYTHONPATH = '.'
python pipeline/orchestrator.py <image_path> [llm_api_key]

# Method 2: Use python -m
python -m pipeline.orchestrator <image_path> [llm_api_key]
```

```bash
# Check application logs
tail -f logs/app.log

# Docker logs
docker logs container_name
```