fix: Handle exactly-sized buffers in compress_into/decompress_into#165

Merged: robert3005 merged 2 commits into spiraldb:develop from paradedb:stuhood.buffer-length-ub on Mar 17, 2026
@stuhood stuhood commented Feb 22, 2026

Contributor

This PR addresses two buffer boundary bugs within the compress_into and decompress_into APIs.

  • decompress_into - Used a fallback loop (while out_end.offset_from(out_ptr) > 8) to decode trailing bytes. If a caller provided a target buffer whose capacity was exactly the uncompressed string's length, the loop terminated early once fewer than 8 bytes of capacity remained. This caused a panic: assertion 'left == right' failed: decompression should exhaust input before output.
  • compress_into - Looped over the input while out_ptr < out_end. However, compress_word can emit an ESCAPE_CODE followed by a literal byte, which advances out_ptr by 2. If out_ptr was at out_end - 1, the loop condition still held, but the second byte was written past the end of the allocation.
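
To make the decompress_into boundary concrete, here is a minimal toy sketch in safe Rust. It is not the fsst-rs implementation: the `Sym` type, the `sym` helper, and `decode_into` are hypothetical names invented for illustration, and the sketch assumes every code indexes a valid table entry. It shows a fast loop that always stores 8 bytes, which is only valid while 8 bytes of room remain, followed by a tail loop that copies exactly the symbol's length so that an exactly-sized buffer can still be filled.

```rust
/// Toy symbol: up to 8 bytes of payload plus its real length.
/// Hypothetical type for illustration, not the fsst-rs symbol type.
#[derive(Clone, Copy)]
struct Sym {
    bytes: [u8; 8],
    len: usize,
}

/// Build a toy symbol from at most 8 bytes (panics on longer input).
fn sym(s: &[u8]) -> Sym {
    let mut bytes = [0u8; 8];
    bytes[..s.len()].copy_from_slice(s);
    Sym { bytes, len: s.len() }
}

/// Decode `codes` into `out`, returning the number of bytes written,
/// or `None` if `out` is too small to hold the decoded data.
fn decode_into(table: &[Sym], codes: &[u8], out: &mut [u8]) -> Option<usize> {
    let mut pos = 0; // next write position in `out`
    let mut i = 0; // next code to read

    // Fast path: unconditional 8-byte store. Only valid while at least
    // 8 bytes of room remain, so it stops short on a tight buffer.
    while i < codes.len() && out.len() - pos >= 8 {
        let s = table[codes[i] as usize];
        out[pos..pos + 8].copy_from_slice(&s.bytes);
        pos += s.len;
        i += 1;
    }

    // Tail path: copy exactly `s.len` bytes, so the remaining codes can
    // still be decoded when fewer than 8 bytes of room are left.
    while i < codes.len() {
        let s = table[codes[i] as usize];
        if out.len() - pos < s.len {
            return None;
        }
        out[pos..pos + s.len].copy_from_slice(&s.bytes[..s.len]);
        pos += s.len;
        i += 1;
    }
    Some(pos)
}
```

With a two-entry table holding `hello ` and `world`, decoding the codes `[0, 1, 0, 1]` into a buffer of exactly 22 bytes succeeds in this sketch, because the last symbol is handled by the tail loop rather than the 8-byte store.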

Added a test for exactly-sized buffers in tests/exact_capacity.rs.
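
The compress_into overrun described above comes down to checking for one byte of room before a write that may need two. The following toy encoder is a sketch only, not the fsst-rs code: `encode_into`, its single-byte symbol table, and the value chosen for `ESCAPE_CODE` are assumptions made for illustration (the table is assumed to have fewer than 255 entries). It checks the exact number of bytes each step will write.

```rust
/// Marker byte that precedes a literal. The value 255 is an assumption
/// for this sketch.
const ESCAPE_CODE: u8 = 255;

/// Toy encoder: a byte found in `table` becomes its 1-byte index; any
/// other byte becomes ESCAPE_CODE followed by the literal (2 bytes).
/// Returns the number of bytes written, or `None` if `out` is too small.
fn encode_into(table: &[u8], input: &[u8], out: &mut [u8]) -> Option<usize> {
    let mut pos = 0; // next write position in `out`
    for &b in input {
        match table.iter().position(|&t| t == b) {
            Some(code) => {
                // One byte of room is enough for a plain code.
                if out.len() - pos < 1 {
                    return None;
                }
                out[pos] = code as u8;
                pos += 1;
            }
            None => {
                // An escape writes two bytes, so a `pos < out.len()`
                // style check alone would let the second write land
                // past the end of the buffer.
                if out.len() - pos < 2 {
                    return None;
                }
                out[pos] = ESCAPE_CODE;
                out[pos + 1] = b;
                pos += 2;
            }
        }
    }
    Some(pos)
}
```

In this sketch, encoding `abz` with the table `ab` needs exactly 4 output bytes; a 3-byte buffer is rejected instead of being overrun.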

@stuhood

stuhood commented Feb 22, 2026

Contributor Author

One thing to call out here: an alternative approach would be for decompress_into to document and assert a strict requirement that decoded.len() >= decompressed_len + 7, and for compress_into to require values.capacity() >= (plaintext.len() * 2) + 16. In that case, the branchless fast paths could be preserved.
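
The two bounds in this alternative can be written down as small helpers. This is a sketch of the arithmetic only: `min_decode_capacity` and `min_compress_capacity` are hypothetical names, not part of the fsst-rs API. The factor of 2 covers the worst case where every input byte is emitted as an escape plus a literal; the constants 7 and 16 are taken directly from the comment above.

```rust
/// Smallest `decoded` length under the proposed strict requirement:
/// the decoded size plus 7 bytes of slack.
fn min_decode_capacity(decompressed_len: usize) -> usize {
    decompressed_len + 7
}

/// Smallest output capacity under the proposed strict requirement:
/// two bytes per input byte (all escapes), plus 16 bytes of slack.
fn min_compress_capacity(plaintext_len: usize) -> usize {
    plaintext_len * 2 + 16
}
```

A caller following this approach would allocate with something like `Vec::with_capacity(min_compress_capacity(plaintext.len()))` before compressing, trading a larger buffer for the simpler fast path.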

stuhood added a commit to paradedb/tantivy that referenced this pull request Feb 23, 2026
stuhood added a commit to paradedb/paradedb that referenced this pull request Feb 23, 2026
stuhood added a commit to paradedb/paradedb that referenced this pull request Feb 23, 2026
@a10y

a10y commented Feb 25, 2026

Contributor

> an alternative approach would be for decompress_into to document/assert-for a strict requirement for decoded.len() >= decompressed_len + 7, and compress_into to require values.capacity() >= (plaintext.len() * 2) + 16

This is documented already in decompress_into:

    /// Decompress a slice of codes into a provided buffer.
    ///
    /// The provided `decoded` buffer must be at least the size of the decoded data, plus
    /// an additional 7 bytes.

We just don't have a great way to enforce that before we start decoding, hence the assertion. If we no longer require that, we should update the doc comment. Have you run any of the decoding benchmarks to see if the decompress_into change has a performance impact?

@codspeed-hq

codspeed-hq Bot commented Feb 25, 2026


Merging this PR will not alter performance

✅ 20 untouched benchmarks
⏩ 20 skipped benchmarks [1]


Comparing paradedb:stuhood.buffer-length-ub (4220a6d) with develop (7ad818b)

Footnotes

  1. 20 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@a10y

a10y commented Feb 25, 2026

Contributor

Seems like CodSpeed is convinced :D

@a10y a10y left a comment

Contributor

just fix the lints and gtg!

@a10y a10y mentioned this pull request Feb 26, 2026
@robert3005

Member

@stuhood would you mind fixing the lint so we can get this merged?

@robert3005 robert3005 enabled auto-merge (squash) March 17, 2026 22:53
@robert3005 robert3005 merged commit 340b1ef into spiraldb:develop Mar 17, 2026
5 checks passed
robert3005 pushed a commit that referenced this pull request Mar 17, 2026
## 🤖 New release

* `fsst-rs`: 0.5.6 -> 0.5.7 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.5.7](v0.5.6...v0.5.7) - 2026-03-17

### Fixed

- Handle exactly-sized buffers in `compress_into`/`decompress_into`
([#165](#165))

### Other

- no more duplicate candidate generation
([#181](#181))
- *(deps)* lock file maintenance
([#180](#180))
- *(deps)* update swatinem/rust-cache digest to e18b497
([#179](#179))
- *(deps)* lock file maintenance
([#178](#178))
- *(deps)* lock file maintenance
([#176](#176))
- Remove codspeed walltime benchmark
([#177](#177))
- Add more micro benchmarks
([#171](#171))
- *(deps)* update marcoieni/release-plz-action digest to 1528104
([#170](#170))
- *(deps)* update codspeedhq/action digest to 281164b
([#169](#169))
- *(deps)* update actions/upload-artifact action to v7
([#167](#167))
- *(deps)* lock file maintenance
([#168](#168))
- *(deps)* update actions/upload-artifact action to v6
([#160](#160))
- *(deps)* lock file maintenance
([#164](#164))
- *(deps)* update swatinem/rust-cache digest to 779680d
([#157](#157))
- *(deps)* update actions/checkout digest to de0fac2
([#158](#158))
- *(deps)* update codspeedhq/action digest to 2ac5728
([#162](#162))
- *(deps)* update marcoieni/release-plz-action digest to f708778
([#166](#166))
- *(deps)* update marcoieni/release-plz-action digest to 52440b5
([#156](#156))
- *(deps)* lock file maintenance
([#161](#161))
- *(deps)* lock file maintenance
([#159](#159))
- *(deps)* update actions/checkout action to v6
([#154](#154))
- *(deps)* lock file maintenance
([#155](#155))
- *(deps)* update codspeedhq/action digest to 346a2d8
([#152](#152))
- *(deps)* update actions/checkout digest to 93cb6ef
([#151](#151))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>