zip-sizer is a command-line tool that estimates the compressed size of large directories. It works by sampling a fraction of the data to efficiently calculate the compression ratio.
- Very memory efficient and fast
- Supports estimates for the gzip and bzip2 algorithms
- Estimates for different compression levels (1-9)
- Accuracy is about +/- 2.5% in my testing, but it depends on the type of files, the size of the archive, and the sampling fraction. (Tested by comparing with tar -cf - <directory> | gzip -9 | wc -c)
To estimate the compressed size of the ~/Downloads directory using gzip with a compression level set to 5 and sampling 10% of the data, with human-readable size reporting:
> bin/zip-sizer -l 5 -a gzip -r 0.1 --human-readable ~/Downloads
# Output
Total original size: 3.73 GB
Estimated compressed size: 3.46 GB
# It is fast enough to be useful
> time bin/zip-sizer -l 5 -a gzip -r 0.1 ~/Downloads
# Output
Total original size: 4003741457 bytes
Estimated compressed size: 3711794335 bytes
bin/zip-sizer -l 5 -a gzip -r 0.1 ~/Downloads 10.30s user 0.34s system 106% cpu 9.952 total
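The two runs above report the same sizes in different units. The sketch below shows how the byte counts relate to the human-readable figures and what compression ratio they imply. The humanReadable helper is hypothetical. Its use of 1024-based units is inferred from the output shown (4003741457 bytes reported as 3.73 GB), not taken from the tool's source.

```go
package main

import "fmt"

// humanReadable formats a byte count using 1024-based units.
// Hypothetical helper; the unit base is inferred from the sample output.
func humanReadable(n int64) string {
	units := []string{"bytes", "KB", "MB", "GB", "TB"}
	v := float64(n)
	i := 0
	for v >= 1024 && i < len(units)-1 {
		v /= 1024
		i++
	}
	if i == 0 {
		return fmt.Sprintf("%d bytes", n)
	}
	return fmt.Sprintf("%.2f %s", v, units[i])
}

func main() {
	const original, estimated = 4003741457, 3711794335
	fmt.Println(humanReadable(original))  // 3.73 GB
	fmt.Println(humanReadable(estimated)) // 3.46 GB
	// Estimated compressed size as a fraction of the original size.
	fmt.Printf("ratio: %.3f\n", float64(estimated)/float64(original)) // ratio: 0.927
}
```

A ratio of about 0.927 means gzip at level 5 is expected to save only around 7% on this directory, which is typical when most of the files are already compressed.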
Download a binary for your operating system from the release page, or build from source:
git clone https://github.com/arunsupe/zip-sizer.git
cd zip-sizer
go build -o bin/zip-sizer zip-sizer.go
Run the program with the following command-line options:
./bin/zip-sizer [options] <directory>
<directory>: The directory to estimate the compressed size of.
-l, --compression-level: Compression level (1-9). Default: 9.
-a, --compression-algorithm: Compression algorithm (gzip or bzip2). Default: gzip.
-r, --sample-ratio: Sample ratio for compression estimation (e.g., 0.1 for 10%). Default: 0.1.
-u, --human-readable: Display sizes in human-readable format.
-v, --verbose: Show what is happening under the hood.
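As a rough illustration of how these options and defaults fit together, here is a sketch that parses them with Go's standard flag package. This is a swapped-in technique: the project itself uses go-arg, and the options struct, the parseOptions function, and the accepted range for the sample ratio (greater than 0, at most 1) are assumptions made for the example.

```go
package main

import (
	"flag"
	"fmt"
)

// options mirrors the documented flags; the struct itself is hypothetical.
type options struct {
	Level     int
	Algorithm string
	Ratio     float64
	Human     bool
	Verbose   bool
	Dir       string
}

// parseOptions registers each option under its short and long name,
// applies the documented defaults, and validates the values.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("zip-sizer", flag.ContinueOnError)
	fs.IntVar(&o.Level, "l", 9, "compression level (1-9)")
	fs.IntVar(&o.Level, "compression-level", 9, "compression level (1-9)")
	fs.StringVar(&o.Algorithm, "a", "gzip", "gzip or bzip2")
	fs.StringVar(&o.Algorithm, "compression-algorithm", "gzip", "gzip or bzip2")
	fs.Float64Var(&o.Ratio, "r", 0.1, "sample ratio")
	fs.Float64Var(&o.Ratio, "sample-ratio", 0.1, "sample ratio")
	fs.BoolVar(&o.Human, "u", false, "human-readable sizes")
	fs.BoolVar(&o.Human, "human-readable", false, "human-readable sizes")
	fs.BoolVar(&o.Verbose, "v", false, "verbose output")
	fs.BoolVar(&o.Verbose, "verbose", false, "verbose output")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if fs.NArg() != 1 {
		return o, fmt.Errorf("expected exactly one directory, got %d", fs.NArg())
	}
	o.Dir = fs.Arg(0)
	if o.Level < 1 || o.Level > 9 {
		return o, fmt.Errorf("compression level must be 1-9, got %d", o.Level)
	}
	if o.Algorithm != "gzip" && o.Algorithm != "bzip2" {
		return o, fmt.Errorf("unsupported algorithm %q", o.Algorithm)
	}
	if o.Ratio <= 0 || o.Ratio > 1 { // assumed valid range
		return o, fmt.Errorf("sample ratio must be in (0, 1], got %v", o.Ratio)
	}
	return o, nil
}

func main() {
	o, err := parseOptions([]string{"-l", "5", "-a", "gzip", "-r", "0.1", "--human-readable", "Downloads"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```

Unlike go-arg, the flag package stops parsing at the first non-flag argument, so in this sketch the directory must come last, as it does in the examples above.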
The program provides the following output:
Total original size of the files in bytes.
Estimated compressed size in bytes.
Total original size: 104857600 bytes
Estimated compressed size: 52428800 bytes
This project uses the following Go libraries:
github.com/alexflint/go-arg for argument parsing.
github.com/dsnet/compress for bzip2 compression.
This project is licensed under the MIT License. See the LICENSE file for details.
Created by arunsupe.