This repository is a fork of pdfparanoia. The code was rewritten to use pdfminer.six, Python 3 and a modern build system.
pdfparanoia is a PDF watermark removal library for academic papers. Some publishers embed identifying information, such as institution names, personal names, IP addresses, and timestamps, in watermarks on each page.

Install from source:

    git clone https://github.com/snake-4/pdfparanoia.git
    cd pdfparanoia
    pip install .

Use it as a library:

    import pdfparanoia

    with open("nmat91417.pdf", "rb") as fin:
        with open("output.pdf", "wb") as fout:
            fout.write(pdfparanoia.scrub(fin.read()))

Or from the command line:

    pdfparanoia --verbose input.pdf -o output.pdf

Supported publishers:

- AIP
- IEEE
- JSTOR
- RSC
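
The library usage shown earlier handles one file at a time. To process a whole folder, a minimal sketch might look like the following. It assumes only what the single-file example shows, namely that `pdfparanoia.scrub` takes the PDF contents as bytes and returns the scrubbed bytes. The helper name `scrub_directory` and its `scrub` parameter are hypothetical and not part of pdfparanoia.

```python
from pathlib import Path


def scrub_directory(src_dir, dst_dir, scrub=None):
    """Scrub every *.pdf in src_dir and write the results to dst_dir.

    Hypothetical helper, not part of pdfparanoia. `scrub` is any callable
    that maps PDF bytes to PDF bytes; it defaults to pdfparanoia.scrub.
    """
    if scrub is None:
        # Imported lazily so the helper can also run with a stand-in callable.
        import pdfparanoia
        scrub = pdfparanoia.scrub

    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    written = []
    for pdf_path in sorted(src.glob("*.pdf")):
        out_path = dst / pdf_path.name
        # Same call pattern as the single-file example: bytes in, bytes out.
        out_path.write_bytes(scrub(pdf_path.read_bytes()))
        written.append(out_path)
    return written
```

Calling `scrub_directory("papers", "scrubbed")` would then write a scrubbed copy of each PDF in `papers` to `scrubbed` under the same file name, leaving the originals untouched.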