Commit d7d947e

[VPM] fix the parser to handle PDF files better: update requirements.txt
1 parent 92dfa5f commit d7d947e

1 file changed

Lines changed: 3 additions & 2 deletions

requirements.txt

@@ -1,15 +1,16 @@
 huggingface_hub
 # LightRAG packages
 lightrag-hku
-# MinerU 2.0 packages (replaces magic-pdf)
+# MinerU 2.0 packages (replaces magic-pdf) - handles PDF parsing with multiple backends
 mineru[core]
 # Progress bars for batch processing
 tqdm
 # Note: Optional dependencies are now defined in setup.py extras_require:
 # - [image]: Pillow>=10.0.0 (for BMP, TIFF, GIF, WebP format conversion)
 # - [text]: reportlab>=4.0.0 (for TXT, MD to PDF conversion)
-# - [paddleocr]: paddleocr + pypdfium2 (for parser='paddleocr')
+# - [paddleocr]: paddleocr + pypdfium2 (for parser='paddleocr' - better OCR for scanned PDFs)
 # - [office]: requires LibreOffice (external program, not Python package)
 # - [all]: includes all optional dependencies
 #
 # Install with: pip install raganything[image,text] or pip install raganything[all]
+# For best PDF handling: pip install raganything[paddleocr]
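
Because the extras named in these comments are optional, it can help to check which of them are importable before choosing parser='paddleocr'. The following is a minimal sketch, not part of this commit: the `EXTRAS` mapping and the `missing_modules` helper are hypothetical names, and the import names (`PIL` for Pillow, `reportlab`, `paddleocr`, `pypdfium2`) are assumed from the packages listed in the comments.

```python
from importlib.util import find_spec

# Hypothetical mapping from extra name to the import names it is assumed to provide.
EXTRAS = {
    "image": ["PIL"],                       # Pillow installs the PIL package
    "text": ["reportlab"],
    "paddleocr": ["paddleocr", "pypdfium2"],
}


def missing_modules(extra, is_installed=None):
    """Return the import names for `extra` that cannot be found."""
    if is_installed is None:
        # Default probe: a top-level module is present if a spec can be located.
        is_installed = lambda name: find_spec(name) is not None
    return [name for name in EXTRAS[extra] if not is_installed(name)]


if __name__ == "__main__":
    for extra in EXTRAS:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"[{extra}] {status}")
```

If `[paddleocr]` reports missing modules, the install hint from the diff applies: pip install raganything[paddleocr]. The [office] extra is not covered here because it depends on LibreOffice, an external program rather than a Python package.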
