difpy2

A fast, in-memory finder for duplicate and similar images, built on perceptual-hash bucketing and Numba-accelerated comparison.

Features

Zero on-disk output: everything runs in RAM
Exact and “similar” modes (configurable MSE threshold)
Perceptual-hash + histogram pre-bucketing to prune comparisons
Numba-JIT mean-squared-error with early bailout
Thread-pooled image loading & feature extraction
CLI and Python API
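
To make the bucketing and early-bailout ideas above concrete, here is a minimal pure-Python sketch. It is not the package's implementation: `histogram_key`, `mse_early_exit`, and `find_duplicates` are hypothetical names, images are assumed to be flat lists of 0-255 grayscale values that were already resized to the same size, and a coarse histogram signature stands in for the perceptual-hash step. The real code runs the comparison under Numba.

```python
from collections import defaultdict
from itertools import combinations


def histogram_key(pixels, bins=8, levels=4):
    """Coarse histogram signature used as a bucket key (illustrative only)."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1  # map 0..255 onto 0..bins-1
    n = len(pixels)
    # Quantize each bin's share so near-identical images tend to share a key.
    return tuple(c * levels // n for c in counts)


def mse_early_exit(a, b, threshold):
    """Return the MSE of a and b, or None as soon as it must exceed threshold."""
    n = len(a)
    limit = threshold * n  # MSE > threshold  <=>  sum of squares > threshold * n
    total = 0
    for x, y in zip(a, b):
        d = x - y
        total += d * d
        if total > limit:
            return None  # bail out without scanning the remaining pixels
    return total / n


def find_duplicates(images, threshold=0.0, bins=8):
    """Compare only images that land in the same histogram bucket."""
    buckets = defaultdict(list)
    for name, pixels in images.items():
        buckets[histogram_key(pixels, bins)].append(name)

    matches = []
    for names in buckets.values():
        for first, second in combinations(names, 2):
            score = mse_early_exit(images[first], images[second], threshold)
            if score is not None:
                matches.append((first, second, score))
    return matches


# Tiny made-up "images": a and b are identical, d differs from a by one level.
images = {
    "a": [0, 0, 255, 255],
    "b": [0, 0, 255, 255],
    "c": [128, 128, 128, 128],
    "d": [0, 1, 255, 255],
}
print(find_duplicates(images, threshold=0.0))
print(find_duplicates(images, threshold=0.5))
```

With a threshold of 0.0 only the identical pair survives; raising the threshold also admits the near-identical image `d`. Image `c` falls into a different bucket and is never compared, which is the point of pre-bucketing.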

Installation

pip install difpy2
Requires Python ≥ 3.12

Quickstart
CLI
difpy2 \
  -D /path/to/images \
  --px_size 50 \
  --bins 8 \
  --sim 0.0     # exact duplicates only; use >0 for “similar” mode

Options

-D, --dirs … one or more image directories

-r, --recursive … recurse into subfolders

-px, --px_size … resize images to px×px for comparison

-b, --bins … per-channel histogram buckets

-s, --sim … MSE threshold (0.0 = exact only)

-t, --threads … number of worker threads
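
For readers who want to see how these flags fit together, the following `argparse` sketch mirrors the option list above. It is an illustration, not the package's actual parser: `build_parser` is a hypothetical name, the default values are assumptions borrowed from the Quickstart examples, and treating `--dirs` as required is also an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented difpy2 options."""
    parser = argparse.ArgumentParser(prog="difpy2")
    parser.add_argument("-D", "--dirs", nargs="+", required=True,
                        help="one or more image directories")
    parser.add_argument("-r", "--recursive", action="store_true",
                        help="recurse into subfolders")
    # Defaults below are assumptions taken from the Quickstart values.
    parser.add_argument("-px", "--px_size", type=int, default=50,
                        help="resize images to px x px for comparison")
    parser.add_argument("-b", "--bins", type=int, default=8,
                        help="per-channel histogram buckets")
    parser.add_argument("-s", "--sim", type=float, default=0.0,
                        help="MSE threshold (0.0 = exact only)")
    parser.add_argument("-t", "--threads", type=int, default=4,
                        help="number of worker threads")
    return parser


# Parse an example command line without touching sys.argv.
args = build_parser().parse_args(
    ["-D", "/path/to/images", "/path/to/more", "-r", "-px", "64", "--sim", "12.5"]
)
print(args.dirs, args.recursive, args.px_size, args.bins, args.sim, args.threads)
```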

Python API
from difpy2 import DuplicateFinder

finder = DuplicateFinder(
    directories=["/path/to/images"],
    px_size=50,
    hist_bins=8,
    similarity=0.0,    # exact duplicates
    threads=4,
)

results, lower_quality, stats = finder.run()

# results: { primary_image_path: [[duplicate_path, mse], …], … }
# lower_quality: [all duplicate/similar image paths]
# stats: { total_files, featurized, groups, duration_s }
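
The shape of `results` described in the comments above can be walked as follows. This is a small sketch that works on a hand-written dictionary rather than real output: `summarize` is a hypothetical helper, the sample paths and MSE values are invented for illustration, and labelling an MSE of 0.0 as "exact" follows the threshold convention stated above.

```python
def summarize(results):
    """Flatten {primary: [[duplicate, mse], ...]} into rows sorted by MSE."""
    rows = []
    for primary, matches in results.items():
        for duplicate, mse in matches:
            kind = "exact" if mse == 0.0 else "similar"
            rows.append((primary, duplicate, mse, kind))
    return sorted(rows, key=lambda row: row[2])


# Made-up data in the documented shape; not produced by a real run.
sample = {
    "/photos/a.jpg": [["/photos/a_copy.jpg", 0.0], ["/photos/a_small.jpg", 12.5]],
    "/photos/b.jpg": [["/photos/b_edit.jpg", 3.2]],
}

for primary, duplicate, mse, kind in summarize(sample):
    print(f"{kind:7} mse={mse:6.2f}  {duplicate}  (matches {primary})")
```

Sorting by MSE puts exact duplicates first, which is a convenient order for reviewing candidates before deleting anything.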

Project Layout
difpy2/
├── difpy_opt.py         # core implementation
├── README.md
├── LICENSE.txt
├── pyproject.toml
└── …
Contributing
Fork the repo

Create a feature branch

Run tests & linters

Submit a pull request

License
This project is licensed under the MIT License. See LICENSE.txt.

Repository: Ryandonofrio3/difpy2
