fx-cmix

The fx-cmix is a updated implementation of fast-cmix.

Prize awarded on February 2, 2024. http://prize.hutter1.net/

Submission Description

This submission contains fallowing modifications on top of the recent fast-cmix Hutter Prize winner:

paq8hp model is replaced with fxcmv1 model with fallowing notable additions:
- Multiple state tables are used in predictors, this allows better predictability.
- Most contexts are divided between 30 main predictors, this allows more efficient memory usage per context.
- Added bracket, quote, first char, char in paragraph, column, table, template, word stream/paragraph context. These contexts are parsed depending on input, this includes parsing of wiki links, http links, tables, columns, paragraphs, quotes, brackets, list, templates.
- Some contexts are swapped in predictors depending on what current input is (table, column mode, word/paragraph, list).
- Some predictors are switched on/off depending on last char, link or current bracket which improves compression.
- Predictions are mixed with context that are more aware of what predictors are outputting.
- Match model (not present in paq8hp model).
- Predictors are faster allowing more complex contexts.
new dictionary
small change in phda9 preprocessor and in two tables in cmix
memory usage is larger in fxcmv1 model compared to old paq8hp

Below is the fx-cmix result:

Metric	Value
fx-cmix compressor's executable file size (S1)	436707 bytes
fx-cmix self-extracting archive size (S2)	112148343 bytes
Total size (S)	112585050 bytes
Previous record (L)	114156155 bytes
fx-cmix improvement (1 - S/L)	1.38%

Experiment platform
Operating system	Ubuntu 20.04
Processor	Intel(R) Xeon(R) CPU @ 2.20GHz Geekbench score 706
Memory	32 GB
Decompression running time	85 hours
Decompression RAM max usage	9575124 KiB
Decompression disk usage	~35GB

Time, disk, and RAM usage are approximately symmetric for compression and decompression.

Instructions

The installation and usage instructions for fx-cmix are the same as for fast-cmix.

One important note: it is recommended to change one variable in the source code for PPM. From line 26 in src/models/ppmd.cpp:

// If mmap_to_disk is set to false (recommended setting), PPM will only use RAM
// for memory.
// If mmap_to_disk is set to true, PPM memory will be saved to disk using mmap.
// This will reduce RAM usage, but will be slower as well. *Warning*: this will
// write a *lot* of data to disk, so can reduce the lifespan of SSDs. Not
// recommended for normal usage.
bool mmap_to_disk = true;

This variable is set to true by default, to comply with the Hutter Prize RAM limit.

Installing packages required for compiling fx-cmix compressor from sources on Ubuntu

Building fx-cmix compressor from sources requires clang-17, upx-ucl, and make packages. On Ubuntu, these packages can be installed by running the following scripts:

./install_tools/install_upx.sh
./install_tools/install_clang-17.sh

Compiling fx-cmix compressor from sources

A bash script is provided for compiling cmix-hp compressor from sources on Ubuntu. This script places the cmix-hp executable file named as cmix in ./run directory. The script can be run as

./build_and_construct_comp.sh

Running fx-cmix compressor

To run the cmix-hp compressor use

cd ./run
cmix -e <PATH_TO_ENWIK9> enwik9.comp

Running fx-cmix decompressor

The compressor is expected to output an executable file named archive9 in the same directory (./run). The file archive9 when executed is expected to reproduce the original enwik9 as a file named enwik9_restored. The executable file archive9 should be launched without argments from the directory containing it.

cd ./run
./archive9

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
dictionary		dictionary
install_tools		install_tools
prof_input		prof_input
src		src
.gitignore		.gitignore
COPYING		COPYING
LICENSE		LICENSE
README.md		README.md
build_and_construct_comp.sh		build_and_construct_comp.sh
construct_from_prebuilt.sh		construct_from_prebuilt.sh
makefile		makefile

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Uh oh!

Repository files navigation

fx-cmix

Submission Description

Instructions

Installing packages required for compiling fx-cmix compressor from sources on Ubuntu

Compiling fx-cmix compressor from sources

Running fx-cmix compressor

Running fx-cmix decompressor

About

Licenses found

Uh oh!

Releases 1

Packages

Languages

License

Licenses found

kaitz/fx-cmix

Folders and files

Latest commit

History

Repository files navigation

fx-cmix

Submission Description

Instructions

Installing packages required for compiling fx-cmix compressor from sources on Ubuntu

Compiling fx-cmix compressor from sources

Running fx-cmix compressor

Running fx-cmix decompressor

About

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages