Releases: siva-sub/client-ocr

v2.0.0 - PPU Models, 100+ Languages & Major Improvements

29 Jul 06:29

🚀 Client-Side OCR v2.0.0

This major release brings PPU PaddleOCR model support, extends language support to 100+ languages, and includes critical fixes for model recognition and performance.

✨ New Features

  • PPU PaddleOCR Model Support: Full support for PPU PaddleOCR models with specialized preprocessing
  • Extended Language Support: Expanded from 14 to 100+ languages with comprehensive model coverage
  • Stack Overflow Prevention: Safe handling of large documents without memory errors
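  • Enhanced Documentation: Usage guide, API reference, troubleshooting notes, and a model selection guide (see the Documentation section)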
  • Model Width Limiting: Automatic width limiting for PPU models to prevent memory issues
  • Improved Error Handling: Better error messages and recovery strategies

🐛 Bug Fixes

  • PPU Model Recognition: Fixed critical issue where PPU models were returning gibberish text instead of correct predictions
    • Implemented proper grayscale conversion (red channel only)
    • Fixed dictionary indexing (0-based instead of 1-based)
  • Stack Overflow Errors: Fixed "Maximum call stack size exceeded" errors when processing large documents
    • Replaced spread operators with loops for large arrays
    • Made debug output safer by skipping operations on large tensors
  • Memory Management: Improved memory handling for large image processing
  • TypeScript Compatibility: Fixed Float32Array type issues
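
The recognition and stack-overflow fixes listed above can be sketched together. This is a minimal illustration rather than the library's actual code: `argmaxRange` and `greedyDecode` are hypothetical names, and the sketch assumes the dictionary array already holds the blank token at index 0, so class indices map directly (0-based) onto dictionary entries.

```typescript
// Hypothetical helper: index of the largest value in data[start, start + length).
// A plain loop keeps the call stack flat; Math.max(...hugeArray) passes every
// element as a function argument and can throw a RangeError on large tensors.
function argmaxRange(data: Float32Array, start: number, length: number): number {
  let best = 0;
  let bestValue = -Infinity;
  for (let i = 0; i < length; i++) {
    const value = data[start + i];
    if (value > bestValue) {
      bestValue = value;
      best = i;
    }
  }
  return best;
}

// Greedy decode of a [steps x classes] score matrix stored row by row.
// Assumption: dictionary[blankIndex] is the blank token, so a class index is
// used as-is (0-based) when looking up a character.
function greedyDecode(
  scores: Float32Array,
  steps: number,
  classes: number,
  dictionary: string[],
  blankIndex = 0,
): string {
  let text = "";
  let previous = -1;
  for (let t = 0; t < steps; t++) {
    const index = argmaxRange(scores, t * classes, classes);
    // Standard CTC collapsing: skip blanks and immediately repeated indices.
    if (index !== blankIndex && index !== previous) {
      text += dictionary[index] ?? "";
    }
    previous = index;
  }
  return text;
}
```

An off-by-one in the dictionary lookup shifts every character to its neighbour, which is one plausible way to end up with the gibberish output described above.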

📚 Documentation

  • Added comprehensive usage documentation with real-world examples
  • Created detailed API reference for all classes and methods
  • Documented common problems and their solutions
  • Added model architecture and selection guide

🔧 Technical Details

  • PPU models now use red channel only for grayscale conversion
  • PPU models use 0-based dictionary indexing
  • Maximum width limited to 800px for PPU models
  • Safer array operations throughout the codebase
  • Enhanced preprocessing pipeline with model-specific normalization
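
As a rough sketch of the preprocessing points above, the following shows red-channel grayscale conversion and the 800px width cap. The helper names `limitWidth` and `redChannelGray` are hypothetical, the input is assumed to be RGBA bytes in the layout used by `ImageData.data`, and the model-specific normalization step is left out because the release notes do not give its constants.

```typescript
const MAX_PPU_WIDTH = 800; // width cap stated in the release notes

// Hypothetical helper: target size that preserves aspect ratio under the cap.
function limitWidth(
  width: number,
  height: number,
  maxWidth: number = MAX_PPU_WIDTH,
): { width: number; height: number } {
  if (width <= maxWidth) {
    return { width, height };
  }
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.max(1, Math.round(height * scale)) };
}

// Hypothetical helper: one gray value per pixel, taken from the red channel only.
// Values stay in the 0..255 range; no normalization is applied here.
function redChannelGray(
  rgba: Uint8ClampedArray,
  width: number,
  height: number,
): Float32Array {
  const gray = new Float32Array(width * height);
  for (let i = 0; i < gray.length; i++) {
    gray[i] = rgba[i * 4]; // R of RGBA; G, B, and A are ignored
  }
  return gray;
}
```

For example, `limitWidth(1600, 400)` yields an 800x200 target, while an image that is already 640px wide is left unchanged.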

📦 Installation

```bash
npm install client-side-ocr@2.0.0
```

🚀 Quick Start

```typescript
import { createRapidOCREngine } from 'client-side-ocr';

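const ocr = createRapidOCREngine({
  language: 'en', // or any of 100+ languages
  modelVersion: 'PP-OCRv4'
});

// Load the models before processing any image
await ocr.initialize();
// imageFile is the image to recognize, supplied by the caller
const result = await ocr.processImage(imageFile);
console.log(result.text);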
```

🙏 Acknowledgments

Special thanks to the RapidOCR and PaddleOCR teams for their excellent models and to the ppu-paddle-ocr project for TypeScript implementation references.

📝 Full Changelog

See CHANGELOG.md for detailed changes.

Release v1.3.0 - Table Detection & Layout Analysis

28 Jul 19:49

🎉 Release v1.3.0 - Enhanced OCR Suite

✨ New Features

📊 Table Detection (RapidTable Integration)

  • PP-Structure models for English and Chinese table recognition
  • SLANet+ model for enhanced accuracy
  • HTML table output with cell detection
  • Automatic table structure extraction
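
To illustrate what HTML table output from detected cells can look like, here is a minimal sketch. The `TableCell` shape and the `cellsToHtml` function are assumptions made for this example and are not the library's actual result format.

```typescript
// Hypothetical cell shape: grid position plus recognized text.
interface TableCell {
  row: number;
  col: number;
  text: string;
}

// Escape characters that would otherwise be parsed as markup.
function escapeHtml(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Group cells by row, order rows and columns, and emit a plain HTML table.
function cellsToHtml(cells: TableCell[]): string {
  const rows = new Map<number, TableCell[]>();
  for (const cell of cells) {
    const list = rows.get(cell.row) ?? [];
    list.push(cell);
    rows.set(cell.row, list);
  }
  const rowIndices = Array.from(rows.keys()).sort((a, b) => a - b);
  const body = rowIndices
    .map((rowIndex) => {
      const ordered = (rows.get(rowIndex) ?? []).slice().sort((a, b) => a.col - b.col);
      const tds = ordered.map((cell) => `<td>${escapeHtml(cell.text)}</td>`).join("");
      return `<tr>${tds}</tr>`;
    })
    .join("");
  return `<table>${body}</table>`;
}
```

Merged cells (row or column spans) are not handled in this sketch.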

📐 Layout Analysis (RapidLayout Integration)

  • PP Layout CDLA model for document analysis
  • YOLOv8n Layout model for academic papers
  • DocLayout-YOLO for DocStructBench
  • Detects text, titles, tables, figures, and formulas

🔧 Unified Model Registry

  • Centralized management of all OCR models
  • Local models from OnnxOCR directory
  • Local models from ppu-paddle-ocr directory
  • PPU Paddle OCR English mobile model set as the default
  • Easy model switching via Model Manager tab

🎨 Enhanced UI

  • New processing mode selector (OCR, Table, Layout, All-in-One)
  • Model Manager tab with GitHub links to model sources
  • Model source information display
  • Configurable model defaults
  • Processing history tracking

🔄 Changes

  • Updated PP-OCRv5 model URLs to use master branch
  • Enhanced OCR interface with unified features
  • Improved model selection UI with source information
  • Better performance with local model loading

🚀 Performance

  • Local models loaded directly from disk for faster performance
  • Web Workers for parallel processing
  • Optimized model loading and caching
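
One common way to avoid loading the same model twice is to cache the in-flight promise per URL. The sketch below shows that pattern only; `createModelCache` and the injected `load` function are hypothetical and do not describe how the library itself caches models.

```typescript
// Hypothetical loader signature: fetches model bytes for a URL.
type ModelLoader = (url: string) => Promise<ArrayBuffer>;

// Returns a function that loads each URL at most once and shares the result.
function createModelCache(load: ModelLoader): (url: string) => Promise<ArrayBuffer> {
  const cache = new Map<string, Promise<ArrayBuffer>>();
  return function getModel(url: string): Promise<ArrayBuffer> {
    let pending = cache.get(url);
    if (!pending) {
      pending = load(url).catch((error) => {
        cache.delete(url); // drop failed loads so a later call can retry
        throw error;
      });
      cache.set(url, pending);
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved buffer means that concurrent requests for the same model share a single load.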

📦 Installation

NPM Package:
```bash
npm install client-side-ocr@1.3.0
```

Live Demo:
https://siva-sub.github.io/client-ocr/

🙏 Acknowledgments

  • RapidAI team for RapidTable and RapidLayout
  • OnnxOCR for local ONNX models
  • ppu-paddle-ocr for optimized mobile models