Python modules implementing OCR-D specs and related tools
This repository contains the Python packages that form the base for tools within the OCR-D ecosphere.
All packages are also published to PyPI.
NOTE: Unless you want to contribute to OCR-D/core, we recommend installation as part of ocrd_all, which installs a complete stack of OCR-D-related software.
The easiest way to install is via pip:
pip install ocrd
# or just the functionality you need, e.g.
pip install ocrd_modelfactory
All Python software released by OCR-D requires Python 3.6 or higher.
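A script that builds on these packages could guard against older interpreters before importing anything else. The following is only a minimal sketch of such a check; the helper name `meets_minimum` and the constant `MINIMUM_PYTHON` are illustrative assumptions and not part of any OCR-D package.

```python
import sys

# Minimum interpreter version, taken from the requirement stated in this README.
MINIMUM_PYTHON = (3, 6)


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, ...) tuple is at least `minimum`.

    Hypothetical helper for illustration; not part of the OCR-D API.
    """
    # Only major and minor are compared, so (3, 6, 0) satisfies (3, 6).
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum(sys.version_info):
        sys.exit("Python %d.%d or higher is required" % MINIMUM_PYTHON)
    print("Python version OK")
```

Comparing tuples rather than version strings avoids the pitfall that "3.10" sorts before "3.6" lexicographically.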
NOTE: All OCR-D CLI tools support a --help flag which shows usage and
supported flags, options and arguments.
A minimal OCR-D processor that copies from -I/--input-file-grp to -O/--output-file-grp.
ocrd_utils contains utilities and constants, e.g. for logging, path normalization and coordinate calculation.
See README for ocrd_utils for further information.
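To give an idea of what "coordinate calculation" can mean in this context, here is a standard-library sketch that parses a PAGE-XML style points string (`x1,y1 x2,y2 ...`) and derives its bounding box. The function names `parse_points` and `bounding_box` are assumptions made for this example; they are not the ocrd_utils API, whose actual functions are documented in its own README.

```python
def parse_points(points):
    """Parse a points string 'x1,y1 x2,y2 ...' into a list of (x, y) integer pairs.

    Hypothetical helper for illustration; not an ocrd_utils function.
    """
    polygon = []
    for pair in points.split():
        x, y = pair.split(",")
        polygon.append((int(x), int(y)))
    return polygon


def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a non-empty list of (x, y) pairs."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    region = parse_points("10,20 110,20 110,70 10,70")
    print(bounding_box(region))
```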
ocrd_models contains file format wrappers for PAGE-XML, METS, EXIF metadata etc.
See README for ocrd_models for further information.
ocrd_modelfactory provides code to instantiate models from existing data.
See README for ocrd_modelfactory for further information.
ocrd_validators provides schemas and routines for validating BagIt, ocrd-tool.json, workspaces, METS, PAGE, CLI parameters etc.
See README for ocrd_validators for further information.
ocrd depends on all of the above and also contains decorators and classes for creating OCR-D processors and CLIs.
It also contains the command line tool ocrd.
See README for ocrd for further information.
Builds a bash script that can be sourced by other bash scripts to create an OCR-D-compliant CLI.
Raise an error and exit.
Delegate logging to ocrd log
Ensure minimum version
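This README does not show how the minimum-version check is carried out. As a rough illustration of the idea only, the sketch below compares dotted version strings numerically; it is written in Python rather than bash, and the names `parse_version` and `ensure_minimum_version` are hypothetical and not taken from the bash library.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.10.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def ensure_minimum_version(current, minimum):
    """Raise RuntimeError if `current` is older than `minimum`.

    Hypothetical helper for illustration; assumes purely numeric components.
    """
    if parse_version(current) < parse_version(minimum):
        raise RuntimeError(
            "version %s is older than required minimum %s" % (current, minimum)
        )


if __name__ == "__main__":
    # Passes silently: 2.10.0 is newer than 2.9.3 when compared numerically.
    ensure_minimum_version("2.10.0", "2.9.3")
    print("version OK")
```

Comparing integer tuples matters here because a plain string comparison would rank "2.10.0" below "2.9.3".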
Output ocrd-tool.json.
Requires $OCRD_TOOL_JSON and $OCRD_TOOL_NAME to be set:
export OCRD_TOOL_JSON=/path/to/ocrd-tool.json
export OCRD_TOOL_NAME=ocrd-foo-bar
Output file resource content.
Output file resource names.
Print usage
Expects an associative array ("hash"/"dict") ocrd__argv to be defined:
declare -A ocrd__argv=()
Usage: pageId=$(ocrd__input_file 3 pageId)
Download assets (make assets)
Test with local files: make test
- Test with local asset server:
  - Start asset-server: make asset-server
  - make test OCRD_BASEURL='http://localhost:5001/'
- Test with remote assets:
  - make test OCRD_BASEURL='https://github.com/OCR-D/assets/raw/master/data/'
- OCR-D Specifications (Repo)
- OCR-D core API documentation (built here via make docs)
- OCR-D Website (Repo)