Releases: OCR-D/core
v2.68.0
Changed:
- ocrd_network: Use `ocrd-all-tool.json` bundled by core instead of downloading it from the website, #1257, #1260
- 🔥 `ocrd network client processing processor` renamed to `ocrd network client processing run`, #1269
- `ocrd network client processing run` supports blocking behavior with `--block` by polling job status, #1265, #1269
Added:
- `ocrd network client workflow run`: Run, optionally blocking, a workflow on the processing server, #1265, #1269
- `ocrd network client workflow check-status` to get the status of a workflow job, #1269
- `ocrd network client processing check-status` to get the status of a processing (processor) job, #1269
- `ocrd network client discovery processors` to list the processors deployed in the processing server, #1269
- `ocrd network client discovery processor` to get the `ocrd-tool.json` of a deployed processor, #1269
- `ocrd network client processing check-log` to retrieve the log data for a processing job, #1269
- Environment variables `OCRD_NETWORK_CLIENT_POLLING_SLEEP` and `OCRD_NETWORK_CLIENT_POLLING_TIMEOUT` to control polling interval and timeout for `ocrd network client {processing,workflow} run`, #1269
- `ocrd workspace clone`/`Resolver.workspace_from_url`: with `clobber_mets=False`, raise a `FileExistsError` for existing `mets.xml` on disk, #563, #1268
- `ocrd workspace find --download`: print the correct, up-to-date field, not `None`, #1202, #1266
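
The blocking behavior described above amounts to a polling loop bounded by an interval and a timeout. A minimal sketch of that pattern follows; it is not the `ocrd_network` implementation. The function name `poll_until_done`, the set of final states and the fallback values (10 s and 3600 s) are assumptions made for illustration. Only the two environment variable names come from the release notes.

```python
import os
import time

# Assumed set of terminal job states; the real server may use other names.
FINAL_STATES = {"SUCCESS", "FAILED"}


def poll_until_done(check_status, sleep=None, timeout=None):
    """Call check_status() repeatedly until it returns a final state.

    check_status is a hypothetical callable returning the job state as a string.
    Raises TimeoutError if no final state is seen within `timeout` seconds.
    """
    # Fall back to the environment variables named in the changelog.
    # The default values below are assumptions, not documented defaults.
    if sleep is None:
        sleep = float(os.environ.get("OCRD_NETWORK_CLIENT_POLLING_SLEEP", "10"))
    if timeout is None:
        timeout = float(os.environ.get("OCRD_NETWORK_CLIENT_POLLING_TIMEOUT", "3600"))

    deadline = time.monotonic() + timeout
    while True:
        state = check_status()
        if state in FINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {state!r} after {timeout} s")
        time.sleep(sleep)
```

Using a monotonic clock keeps the timeout unaffected by system clock adjustments during a long-running job.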
Fixed:
- Sanitize `self.imageFilename` for the `pcGtsId` to ensure it is a valid `xml:id`, #1271
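
An `xml:id` value must be an NCName: it may not contain spaces or colons and may not start with a digit, `.` or `-`. The sketch below shows one way to derive such a value from a file name. It is an illustration only; the helper name `make_xml_id`, the `id_` prefix and the restriction to ASCII characters are assumptions, and core's actual sanitization may differ.

```python
import re


def make_xml_id(name: str, prefix: str = "id_") -> str:
    """Turn an arbitrary file name into a conservative (ASCII-only) NCName."""
    # Replace every character outside letters, digits, '_', '.' and '-'.
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", name)
    # An NCName has to start with a letter or underscore.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = prefix + cleaned
    return cleaned
```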
v3.0.0a2
Changed:
- 🔥 `OcrdPage` as proxy of `PcGtsType` instead of alias; also contains `etree` and `mapping` now
- 🔥 `Processor.zip_input_files` now can throw `ocrd.NonUniqueInputFile` and `ocrd.MissingInputFile` (the latter only if `OCRD_MISSING_INPUT=ABORT`)
- 🔥 `Processor.zip_input_files` does not by default use `require_first` anymore (so the first file in any input file tuple per page can be `None` as well)
- 🔥 no more `Workspace.overwrite_mode`, merely delegate to `OCRD_EXISTING_OUTPUT=OVERWRITE`
- 🎨 improve on docs result for `ocrd_utils.config`
Added:
- 👉 `OCRD_DOWNLOAD_INPUT` for whether input files should be downloaded before processing
- 👉 `OCRD_MISSING_INPUT` for how to handle missing input files (`SKIP` or `ABORT`)
- 👉 `OCRD_MISSING_OUTPUT` for how to handle processing failures (`SKIP` or `ABORT` or `COPY`); the latter behaves like ocrd-dummy for the failed page(s)
- 👉 `OCRD_EXISTING_OUTPUT` for how to handle existing output files (`SKIP` or `ABORT` or `OVERWRITE`)
- new CLI option `--debug` as short-hand for `ABORT` choices above
- `Processor.logger` set up by constructor already (for re-use by processor implementors)
- default-expand and validate `ocrd_tool.json` in `Processor` constructor, log invalidities
- handle JSON `deprecation` in `ocrd_tool.json` by reporting warnings
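
The three `OCRD_MISSING_OUTPUT` choices can be pictured as a per-page error policy. The following sketch is not taken from core: `run_pages`, its arguments and the fallback to `SKIP` when the variable is unset are all assumptions for illustration. Pages are modelled as plain strings in a dict.

```python
import os

VALID_POLICIES = {"SKIP", "ABORT", "COPY"}


def run_pages(pages, process_page, policy=None):
    """Apply process_page to every page, honoring a failure policy.

    pages: dict mapping page ID to page content (hypothetical model).
    policy: SKIP, ABORT or COPY; read from OCRD_MISSING_OUTPUT if omitted
            (the SKIP fallback here is an assumption).
    """
    policy = (policy or os.environ.get("OCRD_MISSING_OUTPUT", "SKIP")).upper()
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy}")

    results = {}
    for page_id, content in pages.items():
        try:
            results[page_id] = process_page(content)
        except Exception:
            if policy == "ABORT":
                raise  # stop the whole run on the first failure
            if policy == "COPY":
                results[page_id] = content  # pass the input through unchanged
            # SKIP: the failed page simply gets no output
    return results
```

With `COPY`, downstream steps still find an output for every page, which is the sense in which the option behaves like ocrd-dummy for failed pages.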
v3.0.0a1
See #1240 for details.
Changed:
- 🔥 Deprecate `Processor.process`
- update spec to v3.25.0, which requires annotating fileGrp cardinality in `ocrd-tool.json`
- 🔥 Remove passing non-processing kwargs to `Processor` constructor, add as members (i.e. `show_help`, `dump_json`, `dump_module_dir`, `list_resources`, `show_resource`, `resolve_resource`)
- 🔥 Deprecate passing processing arg / kwargs to `Processor` constructor (i.e. `workspace`, `page_id`, `input_file_grp`, `output_file_grp`; now all set by `run_processor`)
- 🔥 Deprecate passing `ocrd-tool.json` metadata to `Processor` constructor
- `ocrd.processor`: Handle loading of bundled `ocrd-tool.json` generically
Added:
- `Processor.process_workspace`: process a complete workspace, with default implementation
- `Processor.process_page_file`: process an OcrdFile, with default implementation
- `Processor.process_page_pcgts`: process a single OcrdPage, produce a single OcrdPage, required to implement
- `Processor.verify`: handle fileGrp cardinality verification, with default implementation
- `Processor.setup`: to set up processor before processing, optional
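
The new methods form a layered design in which an implementor overrides only the innermost step. The sketch below illustrates that layering with a stand-in base class. It deliberately does not import `ocrd`: `SketchProcessor` and `UppercaseProcessor` are hypothetical, pages are plain strings rather than `OcrdPage` objects, the method signatures are simplified, and the call order of `verify` and `setup` is an assumption.

```python
class SketchProcessor:
    """Stand-in base class for illustration; this is NOT ocrd.Processor."""

    def setup(self):
        """Optional hook, e.g. for loading models once before processing."""

    def verify(self, pages):
        """Default check; the real method verifies fileGrp cardinality."""
        return True

    def process_page_pcgts(self, page):
        """The one step every implementation has to provide."""
        raise NotImplementedError("subclasses must implement this")

    def process_page_file(self, page):
        """Default: hand a single page on to process_page_pcgts."""
        return self.process_page_pcgts(page)

    def process_workspace(self, pages):
        """Default: verify, set up, then process every page in turn."""
        self.verify(pages)
        self.setup()
        return [self.process_page_file(page) for page in pages]


class UppercaseProcessor(SketchProcessor):
    """Toy implementation overriding only the optional and required hooks."""

    def setup(self):
        self.suffix = "!"

    def process_page_pcgts(self, page):
        return page.upper() + self.suffix
```

Calling `UppercaseProcessor().process_workspace(["a", "b"])` runs the default workspace and file layers and reaches the overridden page-level method once per page.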
v2.67.1
v2.67.0
Changed:
- Additional docker base images with preinstalled tensorflow 1 (`core-cuda-tf1`), tensorflow 2 (`core-cuda-tf2`) and torch (`core-cuda-torch`), #1239
- Resource Manager: Skip the download instead of raising an exception if the target file already exists (unless `--overwrite`), #1246
- Resource Manager: Try to use bundled `ocrd-all-tool.json` if available, #1250, OCR-D/all#444
Added:
- `ocrd process` does support `-U/--mets-server`, #1243
Fixed:
- `ocrd process`-derived tasks are not run in a temporary directory when not called from within workspace, #1243
- Regression from #1238 where processors with required parameters failed, #1255, #1256
- METS Server: Unlink UDS socket file if it exists before startup, #1244
- Resource Manager: Do not create zero-size files for failing downloads, #1201, #1246
- Workspace.add_file: Allow multiple processors to create file group folders simultaneously, #1203, #1253
- Resource Manager: Do not try to run `--dump-json` for known non-processors `ocrd-{cis-data,import,make}`, #1218, #1249
- Resource Manager: Properly handle copying of directories, #1237, #1248
- bashlib: regression in parsing JSON from introducing parameter preset files, #1258
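
Two of the Resource Manager entries in this release (skipping existing targets, and not leaving zero-size files after a failed download) can be illustrated with a write-to-temporary-then-rename pattern. This is a sketch of that general technique, not the Resource Manager's code. `download_atomically` and its `fetch_chunks` callable are hypothetical names, and the network transfer is abstracted away so the example runs offline.

```python
import os
import tempfile


def download_atomically(fetch_chunks, target, overwrite=False):
    """Write the chunks yielded by fetch_chunks() to target, all or nothing.

    Returns False if the target already exists and overwrite is not set,
    True after a completed download. On failure nothing is left on disk.
    """
    if os.path.exists(target) and not overwrite:
        return False  # skip instead of raising

    directory = os.path.dirname(os.path.abspath(target))
    # Create the temporary file next to the target so the final rename
    # stays on the same file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in fetch_chunks():
                tmp.write(chunk)
        os.replace(tmp_path, target)
    except BaseException:
        # Remove the partial file so no empty or truncated target remains.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```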
Removed:
v2.66.1
v2.66.0
Fixed:
- `OcrdFile.url` can now be removed properly, #1226, #1227
- `ocrd workspace find --undo-download`: Only remove file refs if it's an actual download, #1150, #1235
- `ocrd workspace find --undo-download`: When `--keep-files` is not set, remove file from disk, #1150, #1235
- `OCRD_LOGGING_DEBUG`: Normalize/lowercase boolean values, #1230, #1231
- `Workspace.download_file`: Use `Ocrd.local_filename` if set but not already present in the FS, #1149, #1228
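
Normalizing boolean environment values, as mentioned for `OCRD_LOGGING_DEBUG`, typically means lowercasing the string before comparing it. A minimal sketch, assuming a helper named `parse_bool_env` and an accepted set of spellings that is chosen here for illustration and is not core's documented list:

```python
def parse_bool_env(value, default=False):
    """Interpret an environment variable string as a boolean.

    value: the raw string, or None if the variable is unset.
    The accepted spellings below are assumptions for this sketch.
    """
    if value is None:
        return default
    normalized = value.strip().lower()
    if normalized in {"1", "true", "yes", "on"}:
        return True
    if normalized in {"0", "false", "no", "off", ""}:
        return False
    raise ValueError(f"not a boolean value: {value!r}")
```

It would be called as `parse_bool_env(os.environ.get("OCRD_LOGGING_DEBUG"))`, so that `TRUE`, `True` and `true` are all treated alike.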
Changed:
- Install ocrd with `pip --editable` inside Docker, #1225, OCR-D/ocrd_all#416
- Reduce log spam in ocrd_network, #1222
- CI: Stop testing for 3.7, #1207, #1221
Added:
- Separate docker versions for tensorflow v1, tensorflow v2 and torch, #1186
- Processing server can serve as a proxy for METS Server TCP requests, forwarding to UDS, #1220
- `ocrd workspace clean` to remove "untracked", i.e. not METS-referenced, files, #1150, #1236
- `-p` now supports parameter preset resources in addition to raw JSON and absolute/relative paths to JSON files, #930, #969, #1238
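
The three accepted forms of a `-p` value (raw JSON, a path to a JSON file, a named preset) suggest a lookup cascade. The sketch below shows one plausible ordering. It is not the resolution logic used by core: `resolve_parameters`, the `preset_dirs` argument and the order of checks are assumptions for illustration.

```python
import json
import os


def resolve_parameters(value, preset_dirs=()):
    """Resolve a -p style value to a parameter dict.

    Tried in order (an assumed order): raw JSON object, existing file path,
    file of that name inside one of the hypothetical preset_dirs.
    """
    text = value.strip()
    if text.startswith("{"):
        return json.loads(text)  # raw JSON given directly

    candidates = [value] + [os.path.join(d, value) for d in preset_dirs]
    for path in candidates:
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as handle:
                return json.load(handle)
    raise FileNotFoundError(f"no parameter file or preset found for {value!r}")
```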