Using doclytics to update the "Content" of a document

Paperless-ngx uses an OCR engine that handles languages such as Chinese and Korean poorly, and it seems to perform especially badly when multiple languages appear in the same document.

Documents that mix multiple languages are extremely common in Hong Kong.

Could doclytics act as a bridge that uses an LLM to perform the OCR, either replacing the built-in engine or overwriting its output?
For example, the recently released minicpm-v model is capable of OCR in multiple languages.
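
To make the request concrete, here is a minimal Python sketch of the flow being asked about. It is not doclytics code. It assumes a local Ollama instance serving minicpm-v through its `/api/generate` endpoint, and it assumes the Paperless-ngx REST API accepts a token-authenticated PATCH to `/api/documents/{id}/` that overwrites the `content` field, which should be verified against the installed version. The URLs, the prompt, and all function names are hypothetical placeholders.

```python
import base64
import json
import urllib.request

# Assumed endpoints; adjust to the actual deployment.
OLLAMA_URL = "http://localhost:11434/api/generate"
PAPERLESS_URL = "http://localhost:8000"


def build_ocr_request(image_bytes: bytes, model: str = "minicpm-v") -> dict:
    """Build the JSON body for an Ollama generate call with one attached image."""
    return {
        "model": model,
        # Hypothetical prompt; wording will need tuning per model.
        "prompt": "Transcribe all text in this image exactly, keeping the original languages.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def build_content_patch(document_id: int, text: str, token: str) -> urllib.request.Request:
    """Build a PATCH request that would overwrite a document's content field.

    Assumption: Paperless-ngx treats `content` as writable on this endpoint.
    """
    return urllib.request.Request(
        url=f"{PAPERLESS_URL}/api/documents/{document_id}/",
        data=json.dumps({"content": text}).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


def ocr_and_update(image_bytes: bytes, document_id: int, token: str) -> int:
    """Run the LLM OCR, then push the text to Paperless-ngx. Needs both services running."""
    ocr_request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_ocr_request(image_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(ocr_request) as resp:
        text = json.load(resp)["response"]
    with urllib.request.urlopen(build_content_patch(document_id, text, token)) as resp:
        return resp.status
```

Calling `ocr_and_update(page_image_bytes, 123, "my-api-token")` would replace the stored text of document 123 with the model's transcription. Multi-page documents would need one call per page image, with the results concatenated before the update.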

Thanks!

Using doclytics to update the "Content" of a document #97

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Using doclytics to update the "Content" of a document #97

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions