Releases: siva-sub/client-ocr
v2.0.0 - PPU Models, 100+ Languages & Major Improvements
Client-Side OCR v2.0.0
This major release brings PPU PaddleOCR model support, extends language support to 100+ languages, and includes critical fixes for model recognition and performance.
New Features
- PPU PaddleOCR Model Support: Full support for PPU PaddleOCR models with specialized preprocessing
- Extended Language Support: Expanded from 14 to 100+ languages with comprehensive model coverage
- Stack Overflow Prevention: Safe handling of large documents without call stack overflow errors
- Enhanced Documentation:
  - Usage Guide with examples
  - API Reference
  - Troubleshooting Guide
  - Model Documentation
- Model Width Limiting: Automatic width limiting for PPU models to prevent memory issues
- Improved Error Handling: Better error messages and recovery strategies
Bug Fixes
- PPU Model Recognition: Fixed a critical issue where PPU models returned gibberish text instead of correct predictions
  - Implemented proper grayscale conversion (red channel only)
  - Fixed dictionary indexing (0-based instead of 1-based)
- Stack Overflow Errors: Fixed "Maximum call stack size exceeded" errors when processing large documents
  - Replaced spread operators with loops for large arrays
  - Made debug output safer by skipping operations on large tensors
- Memory Management: Improved memory handling for large image processing
- TypeScript Compatibility: Fixed Float32Array type issues
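The spread-operator fix above can be illustrated with a minimal sketch. Spreading a very large array into a function call (for example `Math.max(...values)` or `target.push(...source)`) passes every element as a separate argument, which can exceed the engine's argument limit and raise "Maximum call stack size exceeded". The helper names `safeMax` and `appendAll` below are hypothetical and are not taken from the client-side-ocr codebase; they only show the general loop-based pattern.

```typescript
// Hypothetical helpers for illustration; not part of the client-side-ocr API.

/** Returns the largest value using a plain loop instead of Math.max(...values). */
function safeMax(values: ArrayLike<number>): number {
  let max = -Infinity;
  for (let i = 0; i < values.length; i++) {
    if (values[i] > max) max = values[i];
  }
  return max;
}

/** Appends every element of source to target without target.push(...source). */
function appendAll(target: number[], source: ArrayLike<number>): void {
  for (let i = 0; i < source.length; i++) {
    target.push(source[i]);
  }
}

// A tensor-sized buffer: spreading this many elements can overflow the stack
// on some engines, while the loop version has no such limit.
const big = new Float32Array(1000000);
big[123456] = 2;
console.log(safeMax(big)); // 2
```

The loop versions trade a little brevity for predictable behavior regardless of array size, which matters for model output tensors that can hold hundreds of thousands of values.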
Documentation
- Added comprehensive usage documentation with real-world examples
- Created detailed API reference for all classes and methods
- Documented common problems and their solutions
- Added model architecture and selection guide
Technical Details
- PPU models now use red channel only for grayscale conversion
- PPU models use 0-based dictionary indexing
- Maximum width limited to 800px for PPU models
- Safer array operations throughout the codebase
- Enhanced preprocessing pipeline with model-specific normalization
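As a rough sketch of two of the points above, the following shows red-channel-only grayscale conversion and a width cap of 800px. It assumes RGBA pixel data laid out as four bytes per pixel (the layout used by canvas `ImageData.data`). The function names and the choice to scale the height proportionally are assumptions made for illustration; the release notes do not state whether the library scales or crops when it limits width.

```typescript
// Width cap taken from the release notes; the constant name is hypothetical.
const MAX_PPU_WIDTH = 800;

/** Builds a single-channel image by copying only the red byte of each RGBA pixel. */
function redChannelGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    gray[i] = rgba[i * 4]; // R of pixel i; G, B and A are ignored
  }
  return gray;
}

/** Caps the width at maxWidth, scaling the height proportionally (an assumption). */
function limitWidth(
  width: number,
  height: number,
  maxWidth: number = MAX_PPU_WIDTH
): { width: number; height: number } {
  if (width <= maxWidth) return { width, height };
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.max(1, Math.round(height * scale)) };
}

// Two RGBA pixels: (10, 20, 30, 255) and (200, 1, 2, 255)
const pixels = new Uint8ClampedArray([10, 20, 30, 255, 200, 1, 2, 255]);
console.log(redChannelGrayscale(pixels).join(",")); // "10,200"
console.log(limitWidth(1600, 96)); // { width: 800, height: 48 }
```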
Installation
```bash
npm install client-side-ocr@2.0.0
```
Quick Start
```typescript
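import { createRapidOCREngine } from 'client-side-ocr';

// Create an engine for the target language and model version
const ocr = createRapidOCREngine({
  language: 'en', // or any of 100+ languages
  modelVersion: 'PP-OCRv4'
});

// Initialize the engine before processing any images
await ocr.initialize();

// imageFile is assumed to be an image chosen by the user (for example, from a file input)
const result = await ocr.processImage(imageFile);
console.log(result.text);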
```
Acknowledgments
Special thanks to the RapidOCR and PaddleOCR teams for their excellent models and to the ppu-paddle-ocr project for TypeScript implementation references.
Full Changelog
See CHANGELOG.md for detailed changes.
Release v1.3.0 - Table Detection & Layout Analysis
Release v1.3.0 - Enhanced OCR Suite
New Features
Table Detection (RapidTable Integration)
- PP-Structure models for English and Chinese table recognition
- SLANet+ model for enhanced accuracy
- HTML table output with cell detection
- Automatic table structure extraction
Layout Analysis (RapidLayout Integration)
- PP Layout CDLA model for document analysis
- YOLOv8n Layout model for academic papers
- DocLayout-YOLO for DocStructBench
- Detects text, titles, tables, figures, and formulas
Unified Model Registry
- Centralized management of all OCR models
- Local models from OnnxOCR directory
- Local models from ppu-paddle-ocr directory
- PPU Paddle OCR English mobile set as default
- Easy model switching via Model Manager tab
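To make the registry idea concrete, here is a minimal sketch of how a centralized registry with a default model and model switching could be structured. Everything in it is hypothetical: the `ModelRegistry` class, the `ModelEntry` shape, and the model ids are illustrative names, not the library's actual types or identifiers.

```typescript
// Hypothetical types and ids for illustration; not the client-side-ocr API.
type ModelSource = "OnnxOCR" | "ppu-paddle-ocr";

interface ModelEntry {
  id: string;
  label: string;
  source: ModelSource;
}

class ModelRegistry {
  private models: Record<string, ModelEntry> = {};
  private activeId: string | null = null;

  /** Adds a model; the first model, or one flagged as default, becomes active. */
  register(entry: ModelEntry, isDefault: boolean = false): void {
    this.models[entry.id] = entry;
    if (isDefault || this.activeId === null) this.activeId = entry.id;
  }

  /** Switches the active model, rejecting ids that were never registered. */
  select(id: string): void {
    if (!this.models[id]) throw new Error("Unknown model: " + id);
    this.activeId = id;
  }

  /** Returns the currently active model entry. */
  getActive(): ModelEntry {
    if (this.activeId === null) throw new Error("No models registered");
    return this.models[this.activeId];
  }
}

const registry = new ModelRegistry();
registry.register({ id: "onnxocr-example", label: "Example OnnxOCR model", source: "OnnxOCR" });
registry.register(
  { id: "ppu-en-mobile", label: "PPU Paddle OCR English mobile", source: "ppu-paddle-ocr" },
  true
);
console.log(registry.getActive().id); // "ppu-en-mobile"
```

Keeping the source alongside each entry is what would let a UI such as the Model Manager tab show where a model came from.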
Enhanced UI
- New processing mode selector (OCR, Table, Layout, All-in-One)
- Model Manager tab with GitHub links to model sources
- Model source information display
- Configurable model defaults
- Processing history tracking
Changes
- Updated PP-OCRv5 model URLs to use master branch
- Enhanced OCR interface with unified features
- Improved model selection UI with source information
- Better performance with local model loading
Performance
- Local models loaded directly from disk for faster performance
- Web Workers for parallel processing
- Optimized model loading and caching
Installation
NPM Package:
```bash
npm install client-side-ocr@1.3.0
```
Live Demo:
https://siva-sub.github.io/client-ocr/
Acknowledgments
- RapidAI team for RapidTable and RapidLayout
- OnnxOCR for local ONNX models
- ppu-paddle-ocr for optimized mobile models