Closed
62 commits
12d8f73
Ignore the .idea directory from PyCharm.
tdoan2010 May 17, 2022
88e0320
Add FastAPI and Uvicorn to the requirement list.
tdoan2010 May 17, 2022
2e907b0
Implement the Hello World version of the processing server.
tdoan2010 May 17, 2022
1d0d997
Only pass the "parameter", not everything from kwargs to the object i…
tdoan2010 May 17, 2022
663ee86
Remove init logging. Do it in the init.py of the decorators instead.
tdoan2010 May 17, 2022
54a0108
Return processor information instead of hello world.
tdoan2010 May 18, 2022
53ee1c9
Move server.py to ocrd package. Add a mechanism to detect if server i…
tdoan2010 May 18, 2022
ed3161e
Add some comments and typing.
tdoan2010 May 18, 2022
a89c65b
Add --server-ip and --server-port option. Pass metadata to the Swagge…
tdoan2010 May 18, 2022
ead5dd6
Add help docs for --server-ip and --server-port.
tdoan2010 May 18, 2022
a452035
Initialize the processor with proper parameters.
tdoan2010 May 18, 2022
20daeff
Add the option ocrd server.
tdoan2010 May 19, 2022
bcbd0ff
Move init logging out of the server app.
tdoan2010 May 19, 2022
f222831
Add the --tool option.
tdoan2010 May 19, 2022
2581afe
Add the Pydantic model for ocrd-tool.
tdoan2010 May 19, 2022
4f28485
Restructure the code.
tdoan2010 May 25, 2022
529c721
Add ORM package for MongoDB.
tdoan2010 May 25, 2022
83ad27e
First attempt with MongoDB.
tdoan2010 May 25, 2022
0459dcc
Restructure the code by using the Config class.
tdoan2010 May 25, 2022
053e746
Use different models for input and database.
tdoan2010 May 25, 2022
718ea39
Merge branch 'master' of github.com:tdoan2010/ocrd-core into web-api
tdoan2010 May 25, 2022
36e4ba0
Return 404 when job not found.
tdoan2010 May 25, 2022
89a3724
Load the processor and validate the parameters.
tdoan2010 May 25, 2022
6a0b96e
Run the processor on request.
tdoan2010 Jun 17, 2022
b65fe40
Fix circular import.
tdoan2010 Jun 17, 2022
6102cb0
Set workspace before running the processor.
tdoan2010 Jun 20, 2022
d3ce31d
Fix file not found from the workspace.
tdoan2010 Jun 21, 2022
7c3deee
Change the input/output file groups to array type. Add the METS outpu…
tdoan2010 Jun 27, 2022
80eb9f0
Make sure that the code works with older Python versions.
tdoan2010 Jul 7, 2022
ace8785
Move the help string down. Get proper logger.
tdoan2010 Jul 7, 2022
4611943
Refactor the code to avoid using global variables in cross module com…
tdoan2010 Jul 8, 2022
10dd14f
Refactor the code to avoid using global variables in cross module com…
tdoan2010 Jul 8, 2022
5f22392
Add the first server test.
tdoan2010 Jul 12, 2022
737b712
Fix startup patching. Add error messages.
tdoan2010 Jul 12, 2022
2e32ea3
Add more information to the test ocrd-tool.
tdoan2010 Jul 12, 2022
de74f57
Restructure the test cases.
tdoan2010 Jul 12, 2022
ee5d63c
Lower the requirement to fix the CircleCI
tdoan2010 Jul 13, 2022
beb524d
Lower the requirement to fix the CircleCI
tdoan2010 Jul 13, 2022
a8785db
Make output file group optional.
tdoan2010 Jul 13, 2022
f8d8cc1
Make output file group optional.
tdoan2010 Jul 13, 2022
1a2c15d
Add pytest-mock
tdoan2010 Jul 13, 2022
59f97e9
Add the test for the POST processor endpoint.
tdoan2010 Jul 13, 2022
64482d6
Remove type-hint to make it works with Python 3.6
tdoan2010 Jul 13, 2022
5537b33
Fix patch module for older version of Beanie.
tdoan2010 Jul 13, 2022
62f8236
Change double quote to single quote.
tdoan2010 Jul 20, 2022
0a4ef01
Remove unnecessary parenthesis.
tdoan2010 Jul 20, 2022
20253fc
Restructure the code.
tdoan2010 Jul 20, 2022
739429c
Restructure the test.
tdoan2010 Jul 20, 2022
124b802
Merge branch 'master' of github.com:tdoan2010/ocrd-core into web-api
tdoan2010 Jul 20, 2022
d8fd5ed
Add tests for the get_job endpoint.
tdoan2010 Jul 20, 2022
2d6e344
Reduce options into --server=ip:port:mongo_url
tdoan2010 Sep 30, 2022
4485916
Merge remote-tracking branch 'origin/master' into web-api
tdoan2010 Oct 4, 2022
b9ea819
Fix unit tests with proper patch.
tdoan2010 Oct 4, 2022
e17031e
Read ocrd_tool and version from the stdout.
tdoan2010 Oct 4, 2022
25bb031
Fix late import. Remove unused import.
tdoan2010 Oct 4, 2022
7707b54
Fix parameters default value from an empty dict to None.
tdoan2010 Oct 4, 2022
1db550c
Re-use the run_processor method. Use frozendict for caching. Update t…
tdoan2010 Oct 10, 2022
6c5a095
Fix type assertion.
tdoan2010 Oct 10, 2022
9d5c6af
Fix failed test case.
tdoan2010 Oct 10, 2022
833d834
Add start_time and end_time to the job description.
tdoan2010 Oct 10, 2022
29590df
Change the command name to processing-server and parameter name to --…
tdoan2010 Oct 12, 2022
e621cb4
Fix mets_url when run from CLI.
tdoan2010 Nov 11, 2022
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -125,3 +125,5 @@ sanders*
ws1
*.doctree
.vscode

.idea/
2 changes: 2 additions & 0 deletions ocrd/ocrd/cli/__init__.py
@@ -31,6 +31,7 @@ def get_help(self, ctx):
from ocrd.decorators import ocrd_loglevel
from .zip import zip_cli
from .log import log_cli
from .server import server_cli

@click.group()
@click.version_option()
@@ -48,3 +49,4 @@ def cli(**kwargs): # pylint: disable=unused-argument
cli.add_command(validate_cli)
cli.add_command(log_cli)
cli.add_command(resmgr_cli)
cli.add_command(server_cli)
48 changes: 48 additions & 0 deletions ocrd/ocrd/cli/server.py
@@ -0,0 +1,48 @@
"""
OCR-D CLI: start the processing server

.. click:: ocrd.cli.server:server_cli
:prog: ocrd processing-server
:nested: full

"""
from subprocess import run, PIPE

import click
import uvicorn

from ocrd.helpers import parse_server_input, parse_version_string
from ocrd.server.main import ProcessorAPI
from ocrd_utils import parse_json_string_with_comments, initLogging


@click.command('processing-server')
@click.argument('processor_name', required=True, type=click.STRING)
@click.option('--address',
help='Host name/IP, port, and connection string to a Mongo DB in the format IP:PORT:MONGO_URL',
required=True,
type=click.STRING)
def server_cli(processor_name, address):
try:
ip, port, mongo_url = parse_server_input(address)
except ValueError:
raise click.UsageError('The --address option must have the format IP:PORT:MONGO_URL')

ocrd_tool = parse_json_string_with_comments(
run([processor_name, '--dump-json'], stdout=PIPE, check=True, universal_newlines=True).stdout
)
version = parse_version_string(
run([processor_name, '--version'], stdout=PIPE, check=True, universal_newlines=True).stdout
)

initLogging()

# Start the server
app = ProcessorAPI(
title=ocrd_tool['executable'],
description=ocrd_tool['description'],
version=version,
ocrd_tool=ocrd_tool,
db_url=mongo_url
)
uvicorn.run(app, host=ip, port=port, access_log=False)
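The command above obtains the processor's tool description and version by running the processor executable and capturing its standard output. A minimal sketch of that capture pattern follows. It assumes no real processor is installed, so a child Python process stands in for `<processor> --dump-json`, the JSON payload is hypothetical, and plain `json.loads` replaces `parse_json_string_with_comments`.

```python
import json
import sys
from subprocess import PIPE, run

# Stand-in for `<processor> --dump-json`: a child Python process that prints JSON.
# The payload is hypothetical; a real processor would print its own ocrd-tool section.
fake_dump_json = [
    sys.executable, '-c',
    'import json; print(json.dumps({"executable": "ocrd-dummy", "description": "demo"}))',
]

# check=True raises CalledProcessError if the child exits with a non-zero code;
# universal_newlines=True makes stdout a str instead of bytes.
result = run(fake_dump_json, stdout=PIPE, check=True, universal_newlines=True)
ocrd_tool = json.loads(result.stdout)

print(ocrd_tool['executable'])
```

Because `check=True` is set, a processor name that resolves to a failing executable surfaces as an exception before the server starts, rather than as an empty tool description.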
92 changes: 64 additions & 28 deletions ocrd/ocrd/decorators/__init__.py
@@ -1,40 +1,42 @@
from os.path import isfile
import sys
from contextlib import redirect_stdout
from io import StringIO
from typing import Type

import click
import uvicorn

from ocrd.server.main import ProcessorAPI
from ocrd_utils import getLogger, initLogging
from ocrd_utils import (
is_local_filename,
get_local_filename,
set_json_key_value_overrides,
set_json_key_value_overrides, parse_json_string_with_comments,
)

from ocrd_utils import getLogger, initLogging
from ocrd_validators import WorkspaceValidator
from ocrd.decorators.loglevel_option import ocrd_loglevel
from ocrd.decorators.mets_find_options import mets_find_options
from ocrd.decorators.ocrd_cli_options import ocrd_cli_options
from ocrd.decorators.parameter_option import parameter_option, parameter_override_option
from ocrd.helpers import parse_server_input, parse_version_string
from ocrd.processor.base import run_processor, Processor
from ocrd.resolver import Resolver

from ..resolver import Resolver
from ..processor.base import run_processor

from .loglevel_option import ocrd_loglevel
from .parameter_option import parameter_option, parameter_override_option
from .ocrd_cli_options import ocrd_cli_options
from .mets_find_options import mets_find_options

def ocrd_cli_wrap_processor(
processorClass,
ocrd_tool=None,
mets=None,
working_dir=None,
dump_json=False,
dump_module_dir=False,
help=False, # pylint: disable=redefined-builtin
profile=False,
profile_file=None,
version=False,
overwrite=False,
show_resource=None,
list_resources=False,
**kwargs
processorClass: Type[Processor],
ocrd_tool=None,
mets=None,
working_dir=None,
address=None,
dump_json=False,
dump_module_dir=False,
help=False, # pylint: disable=redefined-builtin
profile=False,
profile_file=None,
version=False,
overwrite=False,
show_resource=None,
list_resources=False,
**kwargs
):
if not sys.argv[1:]:
processorClass(workspace=None, show_help=True)
@@ -50,6 +52,37 @@ def ocrd_cli_wrap_processor(
list_resources=list_resources
)
sys.exit()
if address:
try:
ip, port, mongo_url = parse_server_input(address)
except ValueError:
raise click.UsageError('The --address option must have the format IP:PORT:MONGO_URL')

initLogging()

# Read the ocrd_tool object
f1 = StringIO()
with redirect_stdout(f1):
processorClass(workspace=None, dump_json=True)
ocrd_tool = parse_json_string_with_comments(f1.getvalue())

# Read the version string
f2 = StringIO()
with redirect_stdout(f2):
processorClass(workspace=None, show_version=True)
version = parse_version_string(f2.getvalue())

# Start the server
app = ProcessorAPI(
title=ocrd_tool['executable'],
description=ocrd_tool['description'],
version=version,
ocrd_tool=ocrd_tool,
db_url=mongo_url,
processor_class=processorClass
)

uvicorn.run(app, host=ip, port=port, access_log=False)
else:
initLogging()
LOG = getLogger('ocrd_cli_wrap_processor')
@@ -86,7 +119,8 @@ def ocrd_cli_wrap_processor(
# XXX While https://github.com/OCR-D/core/issues/505 is open, set 'overwrite_mode' globally on the workspace
if overwrite:
workspace.overwrite_mode = True
report = WorkspaceValidator.check_file_grp(workspace, kwargs['input_file_grp'], '' if overwrite else kwargs['output_file_grp'], page_id)
report = WorkspaceValidator.check_file_grp(workspace, kwargs['input_file_grp'],
'' if overwrite else kwargs['output_file_grp'], page_id)
if not report.is_valid:
raise Exception("Invalid input/output file grps:\n\t%s" % '\n\t'.join(report.errors))
if profile or profile_file:
@@ -97,6 +131,7 @@ def ocrd_cli_wrap_processor(
print("Profiling...")
pr = cProfile.Profile()
pr.enable()

def exit():
pr.disable()
print("Profiling completed")
@@ -106,5 +141,6 @@ def exit():
s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats()
print(s.getvalue())

atexit.register(exit)
run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
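In the server branch of the wrapper above, the tool description and version are read by instantiating the processor class while its printed output is redirected into a `StringIO` buffer. The pattern can be sketched in isolation as below. `DemoProcessor` is a hypothetical stand-in for a real `Processor` subclass, and `json.loads` stands in for `parse_json_string_with_comments`.

```python
import json
from contextlib import redirect_stdout
from io import StringIO


class DemoProcessor:
    """Hypothetical stand-in: prints its tool description when dump_json is set."""

    def __init__(self, dump_json=False):
        if dump_json:
            print(json.dumps({'executable': 'ocrd-demo', 'description': 'demo'}))


# Everything written to sys.stdout inside the with-block lands in the buffer.
buffer = StringIO()
with redirect_stdout(buffer):
    DemoProcessor(dump_json=True)

ocrd_tool = json.loads(buffer.getvalue())
```

This avoids spawning a subprocess, at the cost of depending on the constructor writing to `sys.stdout` rather than to a file descriptor directly.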
2 changes: 2 additions & 0 deletions ocrd/ocrd/decorators/ocrd_cli_options.py
@@ -1,3 +1,4 @@
import click
from click import option
from .parameter_option import parameter_option, parameter_override_option
from .loglevel_option import loglevel_option
@@ -26,6 +27,7 @@ def cli(mets_url):
option('-O', '--output-file-grp', help='File group(s) used as output.', default='OUTPUT'),
option('-g', '--page-id', help="ID(s) of the pages to process"),
option('--overwrite', help="Overwrite the output file group or a page range (--page-id)", is_flag=True, default=False),
option('--address', help='Host name/IP, port, and connection string to a Mongo DB.', type=click.STRING),
option('-C', '--show-resource', help='Dump the content of processor resource RESNAME', metavar='RESNAME'),
option('-L', '--list-resources', is_flag=True, default=False, help='List names of processor resources'),
parameter_option,
37 changes: 37 additions & 0 deletions ocrd/ocrd/helpers.py
@@ -0,0 +1,37 @@
from typing import Tuple


def parse_server_input(input_str: str) -> Tuple[str, int, str]:
"""
Parse the string into three parts: IP address, port, and Mongo database connection string.

Args:
input_str (str): a string with the format ``ip:port:db``, where ``ip`` and ``port`` are the address the server
listens on, and ``db`` is a connection string to a Mongo database.

Returns:
str, int, str: the IP, port, and Mongo DB connection string respectively.
"""
elements = input_str.split(':', 2)
if len(elements) != 3:
raise ValueError
ip = elements[0]
port = int(elements[1])
mongo_url = elements[2]

return ip, port, mongo_url


def parse_version_string(version_str: str) -> str:
"""
Get the version number from the output of :py:meth:`~ocrd.processor.base.Processor.show_version`.

Args:
version_str (str): A string which looks like this ``Version %s, ocrd/core %s``

Returns:
str: the string between the word ``Version`` and the first comma
"""
first_split = version_str.split(',')
second_split = first_split[0].split(' ')
return second_split[1]
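A short usage sketch of the two helpers above. The function bodies are condensed copies of the ones in this diff so the block runs on its own. The address and version strings are hypothetical sample values.

```python
from typing import Tuple


def parse_server_input(input_str: str) -> Tuple[str, int, str]:
    # maxsplit=2 leaves any further colons inside the Mongo connection string.
    elements = input_str.split(':', 2)
    if len(elements) != 3:
        raise ValueError
    return elements[0], int(elements[1]), elements[2]


def parse_version_string(version_str: str) -> str:
    # Take the text before the first comma, then the second space-separated word.
    return version_str.split(',')[0].split(' ')[1]


# Hypothetical sample inputs.
ip, port, mongo_url = parse_server_input('localhost:8080:mongodb://localhost:27017')
version = parse_version_string('Version 0.1.0, ocrd/core 2.38.0')

print(ip, port, mongo_url, version)
```

Since the split is on the first two colons, an IPv6 literal as the host part would not parse as intended. A non-numeric port raises `ValueError` from `int()`, which the callers in this diff already catch.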
1 change: 1 addition & 0 deletions ocrd/ocrd/processor/__init__.py
@@ -3,6 +3,7 @@
)
from .helpers import (
run_cli,
run_cli_from_api,
run_processor,
generate_processor_help
)
4 changes: 2 additions & 2 deletions ocrd/ocrd/processor/builtin/ocrd-tool.json
@@ -3,6 +3,6 @@
"description": "Bare-bones processor that copies file from input group to output group",
"steps": ["preprocessing/optimization"],
"categories": ["Image preprocessing"],
"input_file_grp": "DUMMY_INPUT",
"output_file_grp": "DUMMY_OUTPUT"
"input_file_grp": ["DUMMY_INPUT"],
"output_file_grp": ["DUMMY_OUTPUT"]
}