
Modules Explained

Skyler Ruiter edited this page Jun 25, 2025 · 1 revision

Synopsis

The part that makes FZMod meaningfully different from cuSZ/pSZ is the focus on modular design and ease of use.

There is a need for further exploration of how GPU-accelerated lossy compression can benefit different use cases, and this library aims to put more power in the hands of experts to assess for themselves how their data and use cases might benefit from the available compression pipelines.


What are Modules?

Modules are self-contained, high-performance data reduction codes that can be combined to form a compression pipeline. The current implementation of the codebase uses a structure similar to SZ3, which has four main stages:

  • (1) Pre-processing
    • Generally a scan of the data that converts a relative error bound into an absolute one (using the data's value range) before compression
  • (2) Decomposition
    • Prediction with either one of the Lorenzo predictors or the 3D spline predictor, followed by quantization of the difference between each prediction and the true value. This is generally the only lossy step, since the quantization bounds the error.
  • (3) Encoder
    • The encoders take the decomposed data and compress it further losslessly, with either a variant of Huffman encoding or the novel FZGPU encoder from Boyuan Zhang.
    • The Huffman encoder also includes a histogram pre-pass to count the frequencies of the quantization codes
  • (4) Lossless
    • This is an optional final step, executed on the CPU, that runs the encoded data through another lossless compressor such as ZSTD or GZIP. After Huffman encoding, this step often yields little additional reduction.
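
Stages (1) and (2) above can be illustrated with a minimal CPU sketch. The real modules are GPU kernels, and the function names here are illustrative, not the FZMod API; this only shows the logic: converting a relative error bound to an absolute one via the data range, then running a 1D Lorenzo predictor that quantizes the difference between each prediction and the true value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Stage 1 (pre-processing): convert a relative error bound into an
// absolute one using the data's value range, as SZ-style compressors do.
double to_absolute_eb(const std::vector<double>& data, double rel_eb) {
    double lo = data[0], hi = data[0];
    for (double v : data) { lo = std::min(lo, v); hi = std::max(hi, v); }
    return rel_eb * (hi - lo);
}

// Stage 2 (decomposition): 1D Lorenzo prediction (predict each value from
// its reconstructed left neighbor) plus uniform quantization of the
// prediction error. Quantizing with step 2*eb keeps every reconstructed
// value within eb of the original.
std::vector<int32_t> lorenzo_quantize(const std::vector<double>& data,
                                      double eb,
                                      std::vector<double>& recon) {
    std::vector<int32_t> quant(data.size());
    recon.resize(data.size());
    double prev = 0.0;  // prediction for the first element
    for (size_t i = 0; i < data.size(); ++i) {
        double err = data[i] - prev;
        int32_t q = (int32_t)std::round(err / (2 * eb));
        recon[i] = prev + q * (2 * eb);  // the decompressor sees this value
        prev = recon[i];
        quant[i] = q;
    }
    return quant;
}
```

Note that the predictor uses the reconstructed previous value rather than the original one, so the compressor and decompressor stay in sync and the error bound holds end to end.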

Existing pipeline ideas are shown in the CLI wiki page and fzmod CLI help printout. We generally see three main pipelines:

  • General (Lorenzo -> Histogram (Sparse) -> Huffman)
  • Data-Quality (Spline3D -> Histogram -> Huffman)
  • Fast (Lorenzo -> FZG)
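
Both Huffman-based pipelines begin their encoding stage with a histogram of the quantization codes, whose frequencies then drive codebook construction. Below is a hedged CPU sketch of that pre-pass plus a classic Huffman code-length computation; the GPU modules implement equivalent but parallel versions, and these function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <queue>
#include <vector>

// Histogram pre-pass: count how often each quantization code occurs.
// On the GPU this is a parallel kernel; here, codes in, frequencies out.
std::map<int32_t, uint64_t> histogram(const std::vector<int32_t>& codes) {
    std::map<int32_t, uint64_t> freq;
    for (int32_t c : codes) ++freq[c];
    return freq;
}

// Huffman code lengths from the frequency table, via the classic
// priority-queue merge: repeatedly join the two lightest subtrees,
// lengthening the code of every symbol inside them by one bit.
std::map<int32_t, int> huffman_lengths(const std::map<int32_t, uint64_t>& freq) {
    struct Node { uint64_t w; std::vector<int32_t> syms; };
    auto cmp = [](const Node& a, const Node& b) { return a.w > b.w; };
    std::priority_queue<Node, std::vector<Node>, decltype(cmp)> pq(cmp);
    for (const auto& [sym, w] : freq) pq.push(Node{w, {sym}});
    std::map<int32_t, int> len;
    if (pq.size() == 1) { len[pq.top().syms[0]] = 1; return len; }
    while (pq.size() > 1) {
        Node a = pq.top(); pq.pop();
        Node b = pq.top(); pq.pop();
        for (int32_t s : a.syms) ++len[s];
        for (int32_t s : b.syms) ++len[s];
        Node m{a.w + b.w, a.syms};
        m.syms.insert(m.syms.end(), b.syms.begin(), b.syms.end());
        pq.push(m);
    }
    return len;
}
```

Frequent quantization codes (after Lorenzo prediction, most values cluster near zero) get short codes, which is why Huffman alone already captures most of the available redundancy.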

Adding Future Modules

  • Step 1: Write the high-performance data reduction module in modules/ and update CMakeLists.txt to include and build it
    • This can be done either by linking directly against a buildable codebase (ex. ZSTD) or by writing a kernel (ex. the sparse histogram) and including it in the fzmod cmake target.
    • It is recommended to use similar programming patterns to the existing codebase for ease of development.
  • Step 2: Update fzmod_compressor.hh and fzmod_decompressor.hh with needed logic branches
    • This generally means creating a method for your module (ex. huffman()) and adding a branch to the existing if statement for that stage of the compressor that calls your module. Similar steps can be followed for the decompressor implementation. If needed, internal buffers should be declared in and used from buffer.hh
  • Step 3: Update the header.hh file and config.hh (and optionally cli.cc) to support the new module.
    • Simply add similar logic to the existing code for that type of data reduction module
  • Step 4: Test the module and add relevant documentation and supporting code.
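
Step 2 can be pictured as follows. Every name here (the Codec enum, the method names) is a hypothetical stand-in rather than the actual fzmod_compressor.hh API; the point is only the pattern of one method plus one dispatch branch per module.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Illustrative stand-in for the codec choice carried in the config.
enum class Codec { Huffman, FZG, MyNewEncoder };

struct Compressor {
    std::string last_stage;  // records which module ran, for illustration

    void huffman()        { last_stage = "huffman"; }
    void fzg()            { last_stage = "fzg"; }
    void my_new_encoder() { last_stage = "my_new_encoder"; }  // Step 2: new method

    // Encoder-stage dispatch: each module gets one branch.
    void encode(Codec c) {
        if (c == Codec::Huffman)           huffman();
        else if (c == Codec::FZG)          fzg();
        else if (c == Codec::MyNewEncoder) my_new_encoder();  // Step 2: new branch
        else throw std::invalid_argument("unknown codec");
    }
};
```

The decompressor side mirrors this: the same configured codec value selects the matching decode method, which is why the header and config updates in Step 3 must record which module was used.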
