pymupdf

Here are 238 public repositories matching this topic...

pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

python pdf font data-science ocr tesseract epub mupdf text-processing pdf-documents extract-data table-extraction text-shaping xps pymupdf

Updated Mar 2, 2026
Python

ArtifexSoftware / pdf2docx

Open source Python library for converting PDF to DOCX.

pdf-converter docx pymupdf pdf-to-word extract-table

Updated Mar 2, 2026
Python

CBIhalsen / PolyglotPDF

(eBook, PDFs Translation) A multilingual eBook processing tool supporting all eBook formats. Features online and offline translation while preserving original layouts. Compatible with both scanned and digital PDFs. Elegant user interface. The world's highest-performing open-source layout-preserving eBook translator.

pdf latex translation math ebook formulas pymupdf openai-api deepseek

Updated Sep 28, 2025
Python

Krasjet / pdf.tocgen

A CLI toolset to generate table of contents for PDF files automatically.

cli pdf table-of-contents scraping toc-generator pdf-files pdf-document pymupdf

Updated Nov 26, 2023
Python

pymupdf / PyMuPDF-Utilities

Demos, examples and utilities using PyMuPDF

python pdf ocr mupdf pymupdf

Updated Jan 8, 2026
Jupyter Notebook

lucasrla / remarks

Extract annotations (highlights and scribbles) from PDF, EPUB, and notebooks marked with reMarkable tablets. Export to Markdown, PDF, PNG, SVG

markdown pdf ocr highlighting annotations pdf-converter epub zotero obsidian ocrmypdf svg-images pymupdf remarkable-tablet roamresearch

Updated May 26, 2024
Python

genieincodebottle / parsemypdf

A collection of PDF parsing approaches, both AI-based (docling, claude, openai, gemini, meta's llama-vision, unstructured-io) and conventional (pdfminer, pymupdf, pdfplumber, etc.), for efficient snapshot, text, table, and metadata extraction.

ocr openai claude camelot pymupdf pypdf ocr-python markitdown gemini-pro gemini-ai llama-parse omniai unstructured-io docling llama-vision mistral-ocr smoldocling llama4

Updated Aug 29, 2025
Python

vb64 / markdown-pdf

Markdown to PDF renderer

markdown pdf markdown-it pymupdf plantuml-diagrams mermaid-diagrams

Updated Feb 18, 2026
Python

M1ck4 / pdfmd

Smart PDF to Markdown converter with intelligent heading detection, automatic header/footer removal, orphan fragment merging, and image export. Features a user-friendly GUI with preview mode, persistent settings, and per-page error recovery. Optimized for Obsidian and other Markdown-based note-taking workflows.

python markdown open-source pdf ocr gui-application obsidian cli-tool pymupdf privacy-first pdf-to-markdown ofline-tool

Updated Dec 1, 2025
Python

Zain-Bin-Arshad / pdf-viewer

A pure-Python PDF viewer that provides the same functionality as other well-known PDF viewers.

python pdf pdf-viewer pure-python fitz pymupdf python-pdf pysimplegui pdf-viewer-python

Updated Jul 14, 2023
Python

devxzh / PDFTools

A small tool built on PyQt5 and PyMuPDF for batch-adding table-of-contents bookmarks, enhancing PDFs, and splitting and merging PDFs.

pdf bookmark pyqt5 pdf-merge pdf-split pymupdf add-catalog

Updated Aug 5, 2021
Python

shayanalibhatti / Designing-a-PDF-Audiobook-using-Python

A simple implementation of a PDF-to-audio converter.

python python3 pdf-reader audio-converter gtts pytesseract pymupdf pdf-to-audio pdf-text pytesseract-ocr

Updated Mar 30, 2021
Python

seehiong / pdfusion

A powerful PDF processing engine that deconstructs documents into their core elements—text, images, and tables—and seamlessly reconstructs them into pristine, structured Markdown. Built with a React frontend and a robust Python (PyMuPDF) backend on Appwrite.

react python markdown open-source pdf backend hackathon serverless-functions document-processing pymupdf appwrite

Updated Sep 10, 2025
Python

benitomartin / multimodal-llm-pymupdf4llm

Multimodal RAG with PyMuPDF

python openai pymupdf qdrant llama-index

Updated Oct 4, 2024
Jupyter Notebook

TheWatcherMultiversal / pdfgui_tools

pdfgui_tools is a user interface tool developed in Qt and Python that integrates with poppler-utils and PyPDF2 for PDF document management. It's a simple and user-friendly tool that includes various utilities.

linux pdf gnu-linux python3 pdf-document pypdf2 pymupdf qt6 pyside6 poppler-utils

Updated Feb 5, 2024
Python

xxao / pero

Unified Python drawing API

visualization python svg drawing pyqt5 pyside2 wxpython pymupdf pycairo pyqt6 pyside6

Updated Sep 8, 2025
Python

DadaNanjesha / AI-content-detector-Humanizer

A comprehensive web application that detects AI-generated content in PDF documents and transforms AI text into natural human-like writing. Built with Streamlit, spaCy, and Hugging Face transformers.

open-source python3 pymupdf spacy-nlp nltk-python huggingface-transformers transformers-models streamlit-application

Updated Dec 15, 2025
Python

pymupdf / PyMuPDF-Optional-Material

Help file downloads, early ZIP binaries, wheels for retired Python 2.7, 3.5.

python windows pdf mupdf fitz pymupdf

Updated Apr 3, 2022

legout / veritascribe

AI-Powered Thesis Review Tool

thesis master-thesis review-tools bachelor-thesis phd-thesis pymupdf dspy

Updated Aug 8, 2025
Python

NhanPhamThanh-IT / Scan-PDF-Paper

Advanced document analysis platform that extracts text from PDF, DOCX, and TXT files with AI-powered topic classification using Sentence Transformers. Features keyword matching, real-time analysis, interactive Streamlit web interface, and multi-topic support.

python natural-language-processing deployment clean-code production web-application artificial-intelligence dataset releases mit-license oriented-object-programming oops-in-python pymupdf automation-testing streamlit streamlit-webapp streamlit-app streamlit-cloud

Updated Jul 21, 2025
Python
