A High-Performance Multilingual OCR Engine Based on ONNX
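- 2025.12.29
  - Refactored the service layer to FastAPI, with an ASGI architecture for high concurrency.
  - The v1 API remains 100% compatible; a new v2 API adds multi-file processing.
  - Added production-grade features: health checks, monitoring logs, and concurrency control.
  - Supports multiple output formats: JSON, text, TSV, and hOCR.
- 2025.05.21
  - Added the PP-OCRv5 model, which supports five language types in a single model: Simplified Chinese, Traditional Chinese, Chinese Pinyin, English, and Japanese.
  - Overall recognition accuracy improved by 13% compared to PP-OCRv4.
  - Accuracy is consistent with PaddleOCR 3.0.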
- Deep Learning Framework-Free: A universal OCR engine ready for direct deployment.
- Cross-Architecture Support: Uses ONNX models converted from PaddleOCR and re-engineered for deployment on both ARM and x86 machines, with accuracy unchanged even under limited computing power.
- High-Performance Inference: Faster inference on hardware of equal performance.
- Multilingual Support: A single model supports five language types: Simplified Chinese, Traditional Chinese, Chinese Pinyin, English, and Japanese.
- Model Accuracy: Consistent with the PaddleOCR models.
- Domestic Hardware Adaptation: The code architecture has been restructured so that adapting to additional domestic (Chinese-made) GPUs only requires modifying the inference engine.
python>=3.7
# Install dependencies for the FastAPI version
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements-fastapi.txt

python>=3.6
# Install dependencies for the original version
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt

Note:
- The Mobile version model is used by default; the PP-OCRv5_Server-ONNX model offers better performance.
- The Mobile model is already in onnxocr/models/ppocrv5 and requires no download.
- The PP-OCRv5_Server-ONNX model is large, so it is hosted on Baidu Netdisk (extraction code: wu8t). After downloading, place the det and rec models in ./models/ppocrv5/ to replace the existing ones.
python test_ocr.py

# Linux/Mac
./start_fastapi.sh
# Windows
start_fastapi.bat
# Or start manually
gunicorn app.main:app -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:5005 --workers 4

curl -X POST http://localhost:5005/ocr \
-H "Content-Type: application/json" \
-d '{"image": "base64_encoded_image_data"}'

# Single-file upload
curl -X POST http://localhost:5005/api/v2/ocr \
-F "file=@test_image.jpg" \
-F "model_name=PP-OCRv5" \
-F "conf_threshold=0.6" \
-F "output_format=json" \
-F "bbox=true"
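
The single-file upload above can also be issued from Python. The following is a minimal sketch, not part of the project: the helper names `build_form_fields` and `ocr_single_file` are hypothetical, it assumes the third-party `requests` package is installed, and it assumes the endpoint returns a JSON body when `output_format=json`.

```python
from pathlib import Path


def build_form_fields(model_name="PP-OCRv5", conf_threshold=0.6,
                      output_format="json", bbox=True):
    """Return the non-file form fields as strings, mirroring the curl -F flags."""
    return {
        "model_name": model_name,
        "conf_threshold": str(conf_threshold),
        "output_format": output_format,
        "bbox": "true" if bbox else "false",
    }


def ocr_single_file(image_path, base_url="http://localhost:5005", timeout=60):
    """Upload one image to /api/v2/ocr and return the decoded JSON body.

    Assumption: the server answers with JSON for output_format=json.
    """
    # Imported lazily so build_form_fields stays dependency-free.
    import requests

    path = Path(image_path)
    with path.open("rb") as fh:
        response = requests.post(
            f"{base_url}/api/v2/ocr",
            files={"file": (path.name, fh)},  # same field name as the curl example
            data=build_form_fields(),
            timeout=timeout,
        )
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.json()
```

With the service running locally, `ocr_single_file("test_image.jpg")` would send the same request as the curl command.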
# Multi-file upload
curl -X POST http://localhost:5005/api/v2/ocr \
-F "[email protected]" \
-F "[email protected]" \
-F "output_format=text"

curl http://localhost:5005/health # Basic health check
curl http://localhost:5005/api/v2/readyz # Model readiness check

python app-service.py

curl -X POST http://localhost:5005/ocr \
-H "Content-Type: application/json" \
-d '{"image": "base64_encoded_image_data"}'

{
"processing_time": 0.456,
"results": [
{
"text": "Name",
"confidence": 0.9999361634254456,
"bounding_box": [[4.0, 8.0], [31.0, 8.0], [31.0, 24.0], [4.0, 24.0]]
},
{
"text": "Header",
"confidence": 0.9998759031295776,
"bounding_box": [[233.0, 7.0], [258.0, 7.0], [258.0, 23.0], [233.0, 23.0]]
}
]
}

docker build -t onnxocr-fastapi .

# CPU version (default)
docker-compose up -d
# GPU version (auto-detected; falls back to CPU on failure)
docker-compose -f docker-compose.gpu.yml up -d

# Basic run
docker run -itd --name onnxocr-service -p 5005:5005 onnxocr-fastapi

docker run -itd --name onnxocr-service -p 5005:5005 \
-e WORKERS=4 \
-e THREADS=2 \
-e LOG_LEVEL=INFO \
-e DEFAULT_MODEL=PP-OCRv5 \
-e MODEL_CONCURRENCY=8 \
-e USE_GPU=true \
-e MAX_UPLOAD_MB=50 \
onnxocr-fastapi

- NVIDIA Docker Runtime
- CUDA-compatible GPU
- onnxruntime-gpu==1.14.1 (already included in requirements)
- GPU availability is detected automatically; inference falls back to CPU on failure
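
The GPU-to-CPU fallback described above could be implemented along the following lines with ONNX Runtime. This is a sketch under stated assumptions, not the project's actual code: the function names are hypothetical, and how the service really interprets the `USE_GPU` variable is assumed. Only `onnxruntime.get_available_providers()` and the `providers` argument of `onnxruntime.InferenceSession` are taken from the ONNX Runtime API.

```python
import os


def gpu_requested(env=None):
    """Read USE_GPU from the environment (assumed semantics: 'true' enables GPU)."""
    env = os.environ if env is None else env
    return env.get("USE_GPU", "false").strip().lower() == "true"


def choose_providers(available, use_gpu=True):
    """Prefer CUDA when requested and available; always keep CPU as the last resort."""
    providers = []
    if use_gpu and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")
    return providers


def create_session(model_path, use_gpu=None):
    """Create an inference session, retrying on CPU if GPU initialization fails."""
    # Imported lazily so the pure helpers above work without onnxruntime installed.
    import onnxruntime as ort

    if use_gpu is None:
        use_gpu = gpu_requested()
    providers = choose_providers(ort.get_available_providers(), use_gpu)
    try:
        return ort.InferenceSession(model_path, providers=providers)
    except Exception:
        # GPU setup can fail at session creation (driver or CUDA mismatch);
        # fall back to a CPU-only session.
        return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
```

Keeping the provider choice in a pure function makes the fallback rule easy to check without a GPU or a model file.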
# Use the original Dockerfile
docker build -f Dockerfile.flask -t ocr-service .

docker run -itd --name onnxocr-service-v3 -p 5006:5005 onnxocr-service:v3

url: ip:5006/ocr
{
"processing_time": 0.456,
"results": [
{
"text": "Name",
"confidence": 0.9999361634254456,
"bounding_box": [[4.0, 8.0], [31.0, 8.0], [31.0, 24.0], [4.0, 24.0]]
},
{
"text": "Header",
"confidence": 0.9998759031295776,
"bounding_box": [[233.0, 7.0], [258.0, 7.0], [258.0, 23.0], [233.0, 23.0]]
}
]
}

| Example 1 | Example 2 |
|---|---|
| Example 3 | Example 4 |
| Example 5 | Example 6 |
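
For client code, the v1 `/ocr` request body and the JSON response format shown earlier can be handled with the standard library alone. The sketch below is illustrative: the function names are hypothetical, the payload is assumed to be plain base64 (no data-URI prefix), and the sample response is the one from this README.

```python
import base64
import json


def build_v1_payload(image_bytes):
    """Encode raw image bytes as the JSON body for POST /ocr (plain base64 assumed)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})


def confident_texts(response, min_confidence=0.6):
    """Return the recognized strings whose confidence meets the threshold."""
    return [
        item["text"]
        for item in response["results"]
        if item["confidence"] >= min_confidence
    ]


def bbox_extent(bounding_box):
    """Collapse a four-corner box into (x_min, y_min, x_max, y_max)."""
    xs = [point[0] for point in bounding_box]
    ys = [point[1] for point in bounding_box]
    return min(xs), min(ys), max(xs), max(ys)


# Sample response copied from the README.
SAMPLE_RESPONSE = json.loads("""
{
  "processing_time": 0.456,
  "results": [
    {
      "text": "Name",
      "confidence": 0.9999361634254456,
      "bounding_box": [[4.0, 8.0], [31.0, 8.0], [31.0, 24.0], [4.0, 24.0]]
    },
    {
      "text": "Header",
      "confidence": 0.9998759031295776,
      "bounding_box": [[233.0, 7.0], [258.0, 7.0], [258.0, 23.0], [233.0, 23.0]]
    }
  ]
}
""")

# Print each recognized string with its axis-aligned extent.
for item in SAMPLE_RESPONSE["results"]:
    print(item["text"], bbox_extent(item["bounding_box"]))
```

The payload string can be sent with any HTTP client using the `Content-Type: application/json` header, as in the curl examples.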
I am currently seeking job opportunities. Feel free to get in touch!
Thanks to PaddleOCR for technical support!
I am passionate about open source and AI technology, and I believe they can bring convenience and help to those in need, making the world a better place. If you find this project worthwhile, you can support it via Alipay or WeChat Pay (please note "Support OnnxOCR" in the remarks).
Issues and Pull Requests are welcome. Let's improve the project together!