PDF scientific paper translation and bilingual comparison - full-text bilingual translation of PDF documents with the layout fully preserved; supports Google/Ollama translation

liuxing9848/PDFMathTranslate

 
 

PDFMathTranslate

PDF scientific paper translation and bilingual comparison.

  • 📊 Retain formulas and charts.

  • 📄 Preserve table of contents.

  • 🌐 Support multiple translation services.

Installation

pip install pdf2zh

Usage

Run the translation command from the command line. It writes two files to the current directory: the translated document example-zh.pdf and the bilingual document example-dual.pdf.

Translate the entire document

pdf2zh example.pdf
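
As a quick illustration of the output naming described above, the sketch below derives the two expected file names from an input path. The helper name expected_outputs is hypothetical, and the sketch assumes the "-zh" and "-dual" suffixes are appended to the input file's stem exactly as in the example-zh.pdf / example-dual.pdf case; it does not call pdf2zh itself.

```python
from pathlib import Path


def expected_outputs(input_pdf: str) -> tuple[Path, Path]:
    """Return the (translated, bilingual) file names for an input PDF.

    Assumption: the suffixes follow the example-zh.pdf / example-dual.pdf
    pattern and the files are written to the current directory.
    """
    stem = Path(input_pdf).stem  # "papers/example.pdf" -> "example"
    return Path(f"{stem}-zh.pdf"), Path(f"{stem}-dual.pdf")


translated, bilingual = expected_outputs("example.pdf")
print(translated, bilingual)
```

This can be handy in a script that needs to check whether a translation already exists before running the command again.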

Translate part of the document

pdf2zh example.pdf -p 1-3,5
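
To make the page specification concrete, here is a minimal sketch of how a string such as 1-3,5 can be expanded into individual page numbers. The function parse_pages is hypothetical and is not taken from pdf2zh; it assumes ranges are inclusive and that page numbers are 1-based as written on the command line.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a page spec like "1-3,5" into a list of page numbers.

    Assumptions: comma-separated items, "a-b" ranges are inclusive,
    and numbers are kept exactly as the user typed them (1-based).
    """
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            pages.extend(range(start, end + 1))  # inclusive upper bound
        else:
            pages.append(int(part))
    return pages


print(parse_pages("1-3,5"))  # [1, 2, 3, 5]
```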

Translate with the specified language

See Language Codes.

pdf2zh example.pdf -li en -lo ja

Translate with Ollama

See Ollama.

pdf2zh example.pdf -s gemma2
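
When calling the tool from a script, the options shown so far can be assembled into one argument list. The sketch below only uses the flags that appear above (-p, -li, -lo, -s); the helper build_pdf2zh_command is hypothetical, and combining several flags in a single call is an assumption, since each flag is shown on its own in this document.

```python
import shlex
from typing import Optional


def build_pdf2zh_command(
    pdf: str,
    pages: Optional[str] = None,
    lang_in: Optional[str] = None,
    lang_out: Optional[str] = None,
    service: Optional[str] = None,
) -> list[str]:
    """Build an argument list for pdf2zh from the flags shown above."""
    cmd = ["pdf2zh", pdf]
    if pages:
        cmd += ["-p", pages]      # e.g. "1-3,5"
    if lang_in:
        cmd += ["-li", lang_in]   # source language code, e.g. "en"
    if lang_out:
        cmd += ["-lo", lang_out]  # target language code, e.g. "ja"
    if service:
        cmd += ["-s", service]    # e.g. "gemma2" for an Ollama model
    return cmd


cmd = build_pdf2zh_command("example.pdf", pages="1-3,5", lang_in="en", lang_out="ja")
print(shlex.join(cmd))
# To actually run it (requires pdf2zh to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run rather than a single string avoids shell quoting problems, which matters once regex arguments are involved.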

Use regex to specify formula fonts and characters that need to be preserved

pdf2zh BDA3.pdf -f "(CM[^RT].*|MS.*|XY.*|MT.*|BL.*|.*0700|.*0500|.*Italic)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
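
To see what these two expressions select, the sketch below applies them with Python's re module to a few sample font names and characters. The sample names are illustrative only, and matching from the start of the string with re.match is an assumption about how the patterns are applied; pdf2zh may use a different matching mode.

```python
import re

# The two patterns from the command above, copied verbatim.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|XY.*|MT.*|BL.*|.*0700|.*0500|.*Italic)")
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def is_formula_font(font_name: str) -> bool:
    """True if the font name matches the -f pattern (assumed: re.match)."""
    return FONT_PATTERN.match(font_name) is not None


def is_formula_char(char: str) -> bool:
    """True if the character matches the -c pattern (assumed: re.match)."""
    return CHAR_PATTERN.match(char) is not None


for name in ["CMMI10", "CMR10", "Times-Italic"]:
    print(name, is_formula_font(name))

for ch in ["=", "5", "α", "a"]:
    print(ch, is_formula_char(ch))
```

With these samples, CMMI10 and Times-Italic are treated as formula fonts while CMR10 is excluded by the [^RT] class, and the characters =, 5, and α are preserved while a plain a is not.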

Acknowledgement

Document merging: PyMuPDF

Document parsing: Pdfminer.six

Document extraction: MinerU

Multi-threaded translation: MathTranslate

Layout parsing: DocLayout-YOLO
