nhz2 commented Aug 28, 2025

This PR allows `TranscodingStreams.Codec` objects to be used with the `decode`, `decode!`, `try_decode!`, and `try_resize_decode!` functions from ChunkCodecCore (https://github.com/JuliaIO/ChunkCodecs.jl).
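As a rough illustration of the intended usage, here is a minimal sketch that mirrors the calls in the benchmark script from this conversation. It assumes a TranscodingStreams build that includes this PR, together with CodecZstd and ChunkCodecCore; the sample data and variable names are made up for the example.

```julia
using TranscodingStreams
using CodecZstd
using ChunkCodecCore

# Hypothetical sample input: 1000 bytes of a repeating pattern.
data = repeat(UInt8.(0:9), 100)
compressed = transcode(ZstdCompressor, data)

# The codec is initialized explicitly before use, as in the benchmark script.
dcodec = ZstdDecompressor()
TranscodingStreams.initialize(dcodec)

# Assumption: with this PR, a TranscodingStreams codec is accepted by
# ChunkCodecCore's `decode`. `size_hint` is an optional guess at the
# output length.
restored = decode(dcodec, compressed; size_hint=length(data))

TranscodingStreams.finalize(dcodec)
```

If the size of the output is not known in advance, the `size_hint` keyword can be left out, as the benchmarks do for the plain `decode` column.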

Fixes #247 #132 #105

CC: @Moelf @eschnett @mkitti

nhz2 commented Aug 29, 2025

Benchmark script

```julia
using BenchmarkTools
using Random
using TranscodingStreams
using CodecZstd
using ChunkCodecCore

# One benchmark group per code path being compared.
const SUITE = BenchmarkGroup()
cbencht = SUITE["compression"] = BenchmarkGroup()              # transcode with a compressor
dbencht = SUITE["decompression"] = BenchmarkGroup()            # transcode with a decompressor
cbenchd = SUITE["compression-decode"] = BenchmarkGroup()       # decode with a compressor
dbenchd = SUITE["decompression-decode"] = BenchmarkGroup()     # decode with a decompressor
dbenchc = SUITE["decompression-size-hint"] = BenchmarkGroup()  # decode with size_hint

# Reuse initialized codec objects so setup cost is not part of the timings.
ccodec = ZstdCompressor()
dcodec = ZstdDecompressor()
TranscodingStreams.initialize(ccodec)
TranscodingStreams.initialize(dcodec)

for N in [100, 100000, 1000000]
    # Uniformly random bytes: essentially incompressible.
    u1 = rand(Xoshiro(1234), UInt8, N)
    c1 = transcode(ZstdCompressor, u1)

    # Bytes drawn from {0x00, 0x01}: highly compressible.
    u2 = rand(Xoshiro(1234), 0x00:0x01, N)
    c2 = transcode(ZstdCompressor, u2)

    # N Float64 values rounded to 7 binary digits.
    f = round.(randn(Xoshiro(1234), N); base=2, digits=7)
    # byte shuffle (note: u3 holds 8N bytes, not N)
    u3 = vec(permutedims(reinterpret(reshape, UInt8, f), (2, 1)))
    c3 = transcode(ZstdCompressor, u3)
    for (data, compressed, name) in [
        (u1, c1, "incompressible"),
        (u2, c2, "compressible-bytes"),
        (u3, c3, "byteshuffle"),
    ]
        cbencht[name][N] = @benchmarkable transcode($ccodec, $data)
        cbenchd[name][N] = @benchmarkable decode($ccodec, $data)
        dbencht[name][N] = @benchmarkable transcode($dcodec, $compressed)
        dbenchd[name][N] = @benchmarkable decode($dcodec, $compressed)
        dbenchc[name][N] = @benchmarkable decode($dcodec, $compressed; size_hint=$(length(data)))
    end
end
```

CodecZstd Benchmark Results

This is with CodecZstd v0.8.6 and ChunkCodecCore v0.6.0

Compression Benchmarks

| Data Type | Size (N) | transcode | decode |
|---|---|---|---|
| incompressible | 100 | 491.974 ns | 482.872 ns |
| incompressible | 100000 | 11.752 μs | 11.662 μs |
| incompressible | 1000000 | 108.356 μs | 109.308 μs |
| compressible-bytes | 100 | 717.993 ns | 693.305 ns |
| compressible-bytes | 100000 | 130.579 μs | 130.910 μs |
| compressible-bytes | 1000000 | 1.368 ms | 1.369 ms |
| byteshuffle | 100 | 3.867 μs | 3.877 μs |
| byteshuffle | 100000 | 635.483 μs | 632.938 μs |
| byteshuffle | 1000000 | 7.036 ms | 7.029 ms |

Decompression Benchmarks

| Data Type | Size (N) | transcode | decode | decode (size hint) |
|---|---|---|---|---|
| incompressible | 100 | 90.559 ns | 89.657 ns | 97.310 ns |
| incompressible | 100000 | 2.130 μs | 2.131 μs | 3.162 μs |
| incompressible | 1000000 | 25.098 μs | 25.058 μs | 37.391 μs |
| compressible-bytes | 100 | 168.579 ns | 170.996 ns | 179.060 ns |
| compressible-bytes | 100000 | 59.554 μs | 59.634 μs | 58.091 μs |
| compressible-bytes | 1000000 | 633.559 μs | 632.176 μs | 619.482 μs |
| byteshuffle | 100 | 1.886 μs | 1.849 μs | 1.790 μs |
| byteshuffle | 100000 | 208.187 μs | 208.668 μs | 196.686 μs |
| byteshuffle | 1000000 | 2.077 ms | 2.087 ms | 1.865 ms |

nhz2 marked this pull request as ready for review September 11, 2025 21:30
Successfully merging this pull request may close these issues.

Allow generic DenseVector{UInt8} as input/output