A custom OCR Python program for Maxima
- Download `MaximaOCR-Setup-<version>.exe` from https://github.com/automato-ai/maxima-ocr/releases.
- Copy it to the factory PC (the install target is airgapped; the installer is fully self-contained).
- Right-click → Run as administrator. Accept the default install location `C:\Maxima\OCR`.
- The installer registers `MaximaOCR` as a Windows service (auto-start) and starts it. Verify with `sc query MaximaOCR` or by triggering an OCR call.
Run a newer MaximaOCR-Setup-<version>.exe on the same PC. The installer:
- detects the prior install via its AppId,
- stops the `MaximaOCR` service,
- replaces `modbus_server.exe`, `models\`, `tools\`, `nssm.exe`, and the scripts,
- preserves the existing `config.yaml` (site-specific tuning survives),
- re-registers and restarts the service.
The install directory ships with operator scripts. Run them as administrator (right-click → Run as administrator, or from an elevated shell):
| Script | Effect |
|---|---|
| `start.bat` | `sc start MaximaOCR` |
| `stop.bat` | `sc stop MaximaOCR` |
| `restart.bat` | stop, wait for STOPPED, start |
| `register-service.bat` | re-register service if it's broken/missing |
| `unregister-service.bat` | stop + remove service (uninstaller uses this) |
| `uninstall.bat` | run the full uninstaller (accepts `/VERYSILENT`, etc.) |
Edit C:\Maxima\OCR\config.yaml and run restart.bat. Edits survive upgrades.
NSSM redirects the service's stdout/stderr to C:\Maxima\OCR\service-stdout.log
and service-stderr.log (rotated at 1 MB). Application-level logs (DEBUG) are
in modbus_server.log per the logging: section of config.yaml.
The build is self-bootstrapping — no system-installed Inno Setup or NSSM needed.
Prerequisites on the build machine:
- Python with the project venv set up (`pip install -r requirements.txt`).
- `gh` (GitHub CLI), authenticated to an account with read access to the private `dennispo/maxima-ocr-learning` repo. The build fetches the latest `bundle-YYYYMMDD` release from there and bakes it into the installer. In CI: `gh` is preinstalled on Windows runners; just set `GH_TOKEN` (or `GITHUB_TOKEN`) to a PAT with read access to that repo.
- Internet access on first build (downloads NSSM + Inno Setup into `build\tools\`, cached after; models bundle is re-checked every build but the download is skipped if the latest tag is already present locally).
Build:
compile.bat 0.4
The first arg is the version baked into the installer filename and Add/Remove
Programs entry. If omitted, compile.bat falls back to git describe --tags
so a tagged checkout still builds locally. In CI, pass the git tag explicitly:
compile.bat %GITHUB_REF_NAME%
compile.bat wraps build.ps1, which:
- resolves the latest `bundle-*` release from the models repo and downloads it into `models/` if not already present (then prunes stale dated subfolders),
- compiles `modbus_server.exe` and `tools\modbus_client.exe` with PyInstaller,
- fetches and caches NSSM and Inno Setup if not already in `build\tools\`,
- compiles `dist\MaximaOCR-Setup-<version>.exe` via ISCC.
For offline / iteration-on-installer builds:
compile.bat 0.4 -SkipPyInstaller -SkipModelFetch
This program is a Modbus TCP server (Modbus slave) and is controlled over Modbus TCP.
The default listening port is 502, but it can be changed in the config.yaml file.
The protocol on top of Modbus is as follows:
- The master triggers an operation by writing an opcode to the holding register at address `1`. The slave treats this register as a button: an operation is triggered only when the value changes from `0` to a non-zero opcode. Repeated writes of the same opcode are ignored (the button is assumed to be still pressed). The button state resets once the master writes `0` to the register again. See the opcodes below.
- The operation starts. It can take a few seconds. The execution status can be monitored on holding register `2`. Monitoring is optional. See the status codes below.
- Once the operation is complete, the value at register `1` becomes `0`, indicating that no operation is in progress and the slave is ready for the next operation.
- The result of the last executed operation can be read from the holding registers starting at address `3`. The result string is ASCII-encoded, one character per register, and is terminated by `0` (the register immediately after the result string is guaranteed to hold `0`).
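The button semantics and the result encoding described above can be sketched in a few lines of Python. This is an illustration only, not the server's actual code: `ButtonRegister`, `encode_result` and `decode_result` are hypothetical names, registers are modelled as plain integers rather than read over Modbus, and only the master's writes to register `1` are modelled.

```python
class ButtonRegister:
    """Edge-triggered model of holding register 1 (hypothetical helper)."""

    def __init__(self):
        self._last = 0  # last value written by the master

    def write(self, value):
        """Record a master write; return the opcode to start, or None."""
        # Only a 0 -> non-zero transition counts as a button press.
        triggered = value if self._last == 0 and value != 0 else None
        self._last = value
        return triggered


def encode_result(text):
    """Encode a result string as register values: one ASCII char each, then 0."""
    return list(text.encode("ascii")) + [0]


def decode_result(registers):
    """Decode register values (read from address 3 onward) up to the 0 terminator."""
    chars = []
    for value in registers:
        if value == 0:
            break
        chars.append(chr(value))
    return "".join(chars)
```

With this model, writing `2` twice in a row triggers OCR once; the second write returns `None` until a `0` has been written in between.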
| Value of holding register at address `1` | Operation |
|---|---|
| `0` | Idle / release button |
| `1` | Capture video from all cameras to disk |
| `2` | Run OCR; result string is the recognized cylinder digits |
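For masters that build requests by hand, a trigger is a standard Modbus "Write Single Register" request (function code `0x06`). The sketch below assembles such a Modbus TCP frame with the standard library only. The function name, the default `unit_id=1`, and the assumption that the address in the table is used unchanged as the protocol-level register address are illustrative and should be checked against the actual server.

```python
import struct

WRITE_SINGLE_REGISTER = 0x06  # standard Modbus function code


def build_write_register_frame(transaction_id, address, value, unit_id=1):
    """Build a Modbus TCP 'Write Single Register' request (hypothetical helper)."""
    # PDU: function code, register address, register value (all big-endian).
    pdu = struct.pack(">BHH", WRITE_SINGLE_REGISTER, address, value)
    # MBAP header: transaction id, protocol id (0 for Modbus),
    # length of the remaining bytes (unit id + PDU), unit id.
    header = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return header + pdu


# Opcode 2 (run OCR) written to holding register 1; unit id 1 is an assumption.
ocr_trigger = build_write_register_frame(transaction_id=1, address=1, value=2)
```

The resulting bytes can be sent over a TCP socket to the configured port (502 by default). In practice an existing Modbus client library is usually a better choice than raw frames.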
For opcode 2 (OCR), the recognition outcome determines both the status register
and the result string:
| Outcome | Status register | Result string |
|---|---|---|
| Recognized | `0` (Complete) | recognized digits |
| No cameras found | `10` (Error) | `cameras not found` |
| Cameras not ready | `10` (Error) | `cameras not ready` |
| Unrecognized | `10` (Error) | `unrecognized` |
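On the master side, the table above maps to a small decision function. The following is a sketch under the assumption that the status value and the decoded result string have already been read; `interpret_ocr_outcome` and the constant names are hypothetical.

```python
STATUS_COMPLETE = 0
STATUS_WORKING = 1
STATUS_ERROR = 10

# Error strings listed in the outcome table.
KNOWN_ERRORS = {"cameras not found", "cameras not ready", "unrecognized"}


def interpret_ocr_outcome(status, result):
    """Return (ok, text): the digits on success, the error text otherwise."""
    if status == STATUS_COMPLETE:
        return True, result
    if status == STATUS_ERROR:
        # Pass documented errors through; flag anything unexpected.
        if result in KNOWN_ERRORS:
            return False, result
        return False, f"unknown error: {result}"
    raise ValueError(f"operation not finished (status {status})")
```

Checking the status register before the result string matters here, because all three error outcomes share status `10` and are distinguished only by the string.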
Every OCR trigger (opcode 2) leaves a replay artifact in capture.folder
(default ./capture):
| File | Contents |
|---|---|
| `{YYYYMMDD-HHMMSS}-{cam}.mp4` | one video per camera, same naming as opcode `1` |
| `{YYYYMMDD-HHMMSS}-meta.json` | session header + per-frame model decisions + outcome |
The metadata JSON pairs 1:1 with the videos via the shared timestamp prefix.
Each frame entry carries, per camera, the readiness decision (or {"disabled": true} when readiness gating is off), the bounding box, the OCR text +
per-digit detections, and the aggregator's running result (including the
individual digit candidates and a reference to the per-frame original and
preprocessed crop PNGs stored under {prefix}-crops/). The tick-level
agg_status and fused fields capture the cross-camera fusion verdict — the
same data the pipeline uses to decide whether to terminate.
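Because the videos and the metadata share a timestamp prefix, the files of one session can be collected from the file names alone. A minimal sketch, assuming only the naming scheme from the table above; `session_videos` is a hypothetical helper and does not read the JSON contents.

```python
from pathlib import Path

META_SUFFIX = "-meta.json"


def session_videos(meta_path):
    """Return the .mp4 files that share the timestamp prefix of a meta.json."""
    meta_path = Path(meta_path)
    if not meta_path.name.endswith(META_SUFFIX):
        raise ValueError(f"not a session metadata file: {meta_path.name}")
    # "20260511-153012-meta.json" -> "20260511-153012"
    prefix = meta_path.name[: -len(META_SUFFIX)]
    return sorted(meta_path.parent.glob(f"{prefix}-*.mp4"))
```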
Replay a session with tools\replay.exe:
C:\Maxima\OCR\tools\replay.exe C:\Maxima\OCR\capture\20260511-153012-meta.json
The tool shows each camera's video side-by-side with the bbox (cyan, with the bottom edge highlighted in red), a readiness chip, and below each video a zoomed view of the original crop with aggregator candidates marked and the preprocessed crop with OCR digit boxes. Pass a folder instead of a meta.json to replay the most recent session.
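To find the most recent session from a script, for example before archiving it, the timestamp prefix can be compared as text, since `YYYYMMDD-HHMMSS` sorts chronologically. This is a sketch of one way to do it; how `replay.exe` itself selects the session is not specified here, and `latest_session_meta` is a hypothetical name.

```python
from pathlib import Path


def latest_session_meta(folder):
    """Return the newest *-meta.json in a capture folder, or None if empty."""
    # File names start with YYYYMMDD-HHMMSS, so name order is time order.
    metas = sorted(Path(folder).glob("*-meta.json"), key=lambda p: p.name)
    return metas[-1] if metas else None
```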
Recording is enabled by default. To disable on disk-constrained sites:
ocr:
  capture:
    enabled: false

Optionally, the master can monitor the execution of the operation by reading the execution status from
the holding register at address 2 during the operation. The possible values are:
| Value of holding register at address `2` | Execution status |
|---|---|
| `0` | Complete |
| `1` | Working |
| `10` | Complete with error |
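Monitoring the status register amounts to a polling loop with a timeout. The sketch below is transport-agnostic: `read_register` is a hypothetical callable that returns the current value of a holding register, and the timeout and poll interval are arbitrary example values. It assumes the status register already reads Working by the time polling starts.

```python
import time

REG_STATUS = 2
STATUS_WORKING = 1


def wait_for_status(read_register, timeout_s=30.0, poll_s=0.1,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll register 2 until it leaves Working; return the final status code."""
    deadline = clock() + timeout_s
    while True:
        status = read_register(REG_STATUS)
        if status != STATUS_WORKING:
            return status  # 0 = Complete, 10 = Complete with error
        if clock() >= deadline:
            raise TimeoutError("operation still Working after timeout")
        sleep(poll_s)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real delays, for example by passing `sleep=lambda s: None`.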