- Sajon GmbH
- Bern, Switzerland
- sajon.net
- https://orcid.org/0009-0004-2126-9129
- https://unibe-ch.academia.edu/JonasH%C3%A4ssig
Stars
Toolkit for linearizing PDFs for LLM datasets/training
Get your documents ready for gen AI
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
The headless rich text editor framework for web artisans.
Open-source technology for creating full-stack knowledge applications for communities of all types.
ScanTailor Advanced merges the features of the ScanTailor Featured and ScanTailor Enhanced versions and adds new features and fixes.
A complete alternative for Overleaf with VSCode + Web + Git Integration + Copilot + Grammar & Spell Checker + Live Collaboration Support. Based on GitHub Codespace and Dev container.
A hatch plugin to help build Jupyter packages
Allows you to access your calibre libraries and read books directly in Obsidian.
Convert PDF to markdown + JSON quickly with high accuracy
Convert Word documents (.docx files) to HTML
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Python tool for converting files and office documents to Markdown.
Mdformat plugin for MyST compatibility
Interpreter for interactive educational content, written in an extended Markdown format...
Parse PDFs into markdown using Vision LLMs
sddai / markerPDF
Forked from datalab-to/marker
Convert PDF to markdown quickly with high accuracy
Repository for the book Among Digitized Manuscripts by L.W. Cornelis van Lit (Leiden: Brill, 2020)
The Project Gutenberg tool to generate EPUBs and other ebook formats.
📚 Freely available programming books
A machine learning software for extracting information from scholarly documents