# Modules Explained
Skyler Ruiter edited this page Jun 25, 2025 · 1 revision
The part that makes FZMod meaningfully different from cuSZ/pSZ is its focus on modular design and ease of use.
There is a need to further explore how GPU-accelerated lossy compression can benefit different use cases, and this library aims to put more power in the hands of experts to assess for themselves how their data and use case might benefit from the available compression pipelines.
Modules are self-contained high-performance data reduction codes that can be combined to form a compression pipeline. The current implementation of the codebase uses a similar structure to SZ3 which has four main stages:
- (1) Pre-processing
- Generally scanning the data range and scaling the error bound when a relative (rather than absolute) error bound is requested
- (2) Decomposition
- Prediction with either one of the Lorenzo predictors or the 3D spline predictor, followed by quantization of the error between the prediction and the true value. This is generally the only lossy step, since the quantization is what bounds the error.
- (3) Encoder
- The encoders take the decomposed data and compress it further losslessly, either with a version of Huffman encoding or with a novel FZGPU encoder from Boyuan Zhang.
- The Huffman encoder also includes a pre-processing histogram stage to count the quantization codes
- (4) Lossless
- This is an optional last step executed on the CPU to take the encoded data and run it through another lossless encoder such as ZSTD or GZIP. Often this step is not very meaningful after Huffman encoding.
Existing pipeline ideas are shown on the CLI wiki page and in the fzmod CLI help printout. We generally see three main pipelines:
- General (Lorenzo -> Histogram (Sparse) -> Huffman)
- Data-Quality (Spline3D -> Histogram -> Huffman)
- Fast (Lorenzo -> FZG)
To add a new module:
- Step 1: Write the high-performance data reduction module in `modules/` and update `CMakeLists.txt` to include and build it. This can be done either by linking directly to a buildable codebase (ex. ZSTD) or by writing a kernel (ex. the sparse histogram) and including it in the fzmod cmake target.
- It is recommended to use similar programming patterns to the existing codebase for ease of development.
- Step 2: Update `fzmod_compressor.hh` and `fzmod_decompressor.hh` with the needed logic branches. This generally means creating a method for your module (ex. `huffman()`), adding a branch to the existing if statement for that stage of the compressor, and calling your module from it. Similar steps can be followed for the decompressor implementation. If needed, internal buffers should be declared in and used from `buffer.hh`.
- Step 3: Update `header.hh` and `config.hh` (and optionally `cli.cc`) to support the new module. Simply add logic similar to the existing code for that type of data reduction module.
- Step 4: Test the module and add relevant documentation and supporting code.
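As a rough illustration of Steps 1 and 2, the sketch below pairs a self-contained module with a dispatch branch for the encoder stage. Everything here is hypothetical: `EncoderType`, `my_encoder`, and `encode_stage` are illustrative stand-ins for the real method and branch you would add in `fzmod_compressor.hh`, and the module itself is a trivial run-length encoder rather than a GPU kernel:

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-in for the encoder-stage selector in the config.
enum class EncoderType { Huffman, FZG, MyNewEncoder };

// Step 1 (sketch): a self-contained module. Here, a trivial run-length
// encoder over quantization codes, emitted as (value, run-length) pairs.
std::vector<int> my_encoder(const std::vector<int>& codes) {
    std::vector<int> out;
    for (size_t i = 0; i < codes.size(); ) {
        size_t run = 1;
        while (i + run < codes.size() && codes[i + run] == codes[i]) ++run;
        out.push_back(codes[i]);
        out.push_back(static_cast<int>(run));
        i += run;
    }
    return out;
}

// Step 2 (sketch): extend the encoder stage's branching with the new
// module. The existing paths are elided here.
std::vector<int> encode_stage(EncoderType t, const std::vector<int>& codes) {
    if (t == EncoderType::Huffman) {
        // existing huffman() path (with its histogram pre-pass)
    } else if (t == EncoderType::FZG) {
        // existing fzg() path
    } else if (t == EncoderType::MyNewEncoder) {
        return my_encoder(codes);  // the new branch
    }
    return codes;  // placeholder for the elided existing paths
}
```

A matching branch in the decompressor would then invert `my_encoder`, and Step 3 would add the new `EncoderType` value to the header and config handling.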