A comprehensive OCR and document understanding library built in Rust with ONNX Runtime.
cargo add oar-ocr
With GPU support:
cargo add oar-ocr --features cuda
use oar_ocr::prelude::*;
use std::path::Path;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ocr = OAROCRBuilder::new(
        "pp-ocrv5_mobile_det.onnx",
        "pp-ocrv5_mobile_rec.onnx",
        "ppocrv5_dict.txt",
    )
    .build()?;
    let image = load_image(Path::new("document.jpg"))?;
    let results = ocr.predict(vec![image])?;
    for text_region in &results[0].text_regions {
        if let Some((text, confidence)) = text_region.text_with_confidence() {
            println!("{} ({:.2})", text, confidence);
        }
    }
    Ok(())
}

use oar_ocr::oarocr::OARStructureBuilder;
let structure = OARStructureBuilder::new("pp-doclayout_plus-l.onnx")
    .with_table_classification("pp-lcnet_x1_0_table_cls.onnx")
    .with_table_structure_recognition("slanet_plus.onnx", "wireless")
    .table_structure_dict_path("table_structure_dict_ch.txt")
    .with_ocr("pp-ocrv5_mobile_det.onnx", "pp-ocrv5_mobile_rec.onnx", "ppocrv5_dict.txt")
    .build()?;

- Usage Guide - Detailed API usage, builder patterns, GPU configuration
- Pre-trained Models - Model download links and recommended configurations
cargo run --example ocr -- --help
cargo run --example structure -- --help
See the examples/ directory for complete CLI examples.
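
The quick-start loop prints every `(text, confidence)` pair as it comes. A common next step is to drop low-confidence regions before using the text downstream. The sketch below is illustrative only: `Recognized`, `filter_by_confidence`, and `join_text` are hypothetical names and are not part of the oar-ocr API. It assumes you have already copied each pair out of the results into a plain struct.

```rust
/// Hypothetical stand-in for one recognized region: its text and a confidence score.
/// This is NOT a type from oar-ocr; it only mirrors the `(text, confidence)` pair
/// printed in the quick-start loop.
#[derive(Debug, Clone, PartialEq)]
pub struct Recognized {
    pub text: String,
    pub confidence: f32,
}

/// Keep only regions whose confidence is at least `min_confidence`,
/// preserving the original order. A NaN confidence never passes the check.
pub fn filter_by_confidence(regions: &[Recognized], min_confidence: f32) -> Vec<&Recognized> {
    regions
        .iter()
        .filter(|r| r.confidence >= min_confidence)
        .collect()
}

/// Join the texts of the given regions into one newline-separated string.
pub fn join_text(regions: &[&Recognized]) -> String {
    regions
        .iter()
        .map(|r| r.text.as_str())
        .collect::<Vec<_>>()
        .join("\n")
}
```

For example, given regions with confidences 0.98, 0.31, and 0.87, calling `filter_by_confidence(&regions, 0.5)` keeps the first and third. The 0.5 threshold is an arbitrary assumption; a suitable value depends on your models and documents.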
PaddleOCR-VL is a Vision-Language model for advanced document understanding. It supports element-level OCR and layout-first document parsing. Our implementation uses Candle for inference. Download the model first:
huggingface-cli download PaddlePaddle/PaddleOCR-VL --local-dir PaddleOCR-VL
# Element-level OCR
cargo run --release --features paddleocr-vl,cuda --example paddleocr_vl -- --model-dir PaddleOCR-VL --task ocr document.jpg
# Table recognition (outputs HTML)
cargo run --release --features paddleocr-vl,cuda --example paddleocr_vl -- --model-dir PaddleOCR-VL --task table table.jpg
# Formula recognition (outputs LaTeX)
cargo run --release --features paddleocr-vl,cuda --example paddleocr_vl -- --model-dir PaddleOCR-VL --task formula formula.png
# Chart recognition
cargo run --release --features paddleocr-vl,cuda --example paddleocr_vl -- --model-dir PaddleOCR-VL --task chart chart.png
# Layout-first doc parsing (PP-DocLayoutV2 -> PaddleOCR-VL)
cargo run --release --features paddleocr-vl,cuda --example paddleocr_vl -- --model-dir PaddleOCR-VL --layout-model pp-doclayoutv2.onnx document.jpg

This project builds upon the excellent work of several open-source projects:
- ort: Rust bindings for ONNX Runtime by pykeio. This crate provides the Rust interface to ONNX Runtime that powers inference in this OCR library.
- PaddleOCR: Baidu's multilingual OCR toolkit based on PaddlePaddle. This project uses PaddleOCR's pre-trained models for text detection and recognition across multiple languages.
- Candle: A minimalist ML framework for Rust by Hugging Face. We use Candle to implement Vision-Language model inference.