add builtin processors ocrd-command and ocrd-merge #1343
kba
left a comment
LGTM! For out-of-the-box UX and testing, would it make sense to also bundle (some? all?) of the page processing scripts you developed for ocrd-command?
Yes. I'll add
Is this sufficient documentation in your opinion, @kba? (Perhaps the actual workflow recipes should go into the workflow guide – there is only so much you can do here without mentioning/depending on other tools and actual problems...)
Yes, excellent, many thanks. I think the list of presets is a good starting point; they are documented and bundled. Of course, the workflow guide sorely needs an update and would benefit from this, but that should not be a prerequisite for this PR. Merging and releasing now.
Use cases for `ocrd-command` are pretty much every tool that can do things with PAGE, e.g.

(Building on tools from https://github.com/kba/transkribus-to-prima. The `sed` command just ensures that segment identifiers are valid XML ids, as is not always the case in Transkribus.)

(Building on tools from https://github.com/bertsky/workflow-configuration. The first adds `@orientation` to pages by measuring average slope of annotated lines. The second slices up the ReadingOrder into UnorderedGroups at every `@header` region it encounters.)

(Building on https://github.com/PRImA-Research-Lab/prima-page-converter, which can convert between PAGE namespace versions.)
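To illustrate the identifier fix mentioned above, here is a minimal Python sketch of the general idea of turning an arbitrary string into a valid XML id. It is not the `sed` command from this thread, and it is not part of `ocrd-command`. The function name `make_valid_xml_id`, the `id_` prefix and the ASCII-only character rule are assumptions made for illustration (the real XML NCName rules also allow many non-ASCII letters).

```python
import re

# Simplified, ASCII-only approximation of the XML NCName rules (assumption):
# an id must start with a letter or underscore, and may then contain
# letters, digits, underscores, hyphens and periods.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_.-]")
_VALID_START = re.compile(r"[A-Za-z_]")


def make_valid_xml_id(raw: str, prefix: str = "id_") -> str:
    """Return a string usable as an XML id (hypothetical helper).

    Characters outside the allowed set are replaced by underscores;
    if the result is empty or starts with a disallowed character
    (for example a digit), the prefix is prepended.
    """
    cleaned = _INVALID_CHARS.sub("_", raw)
    if not cleaned or not _VALID_START.match(cleaned):
        cleaned = prefix + cleaned
    return cleaned


if __name__ == "__main__":
    for example in ["region_1", "1234", "r 1:l2"]:
        print(example, "->", make_valid_xml_id(example))
```

In a real PAGE file, any attribute that refers to a renamed id (such as the region references inside the ReadingOrder) would have to be rewritten consistently, and two different original ids could collide after cleaning; the sketch handles neither case.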