perf(candid): linear-time encoding/decoding of large nat/int #744
Open
lwshang wants to merge 3 commits into
Nat/Int encode and decode previously operated one 7-bit group at a time, shifting or OR-ing the whole bignum on every byte, which is O(n^2) in the encoded length. Rebuild the value in a single pass instead:

- decode: collect the 7-bit groups and construct the bignum once via BigUint::from_radix_le(_, 128) (radix 128 is a power of two, so this bit-packs in O(n)); Int additionally reinterprets the two's-complement sign.
- encode: emit groups via BigUint::to_radix_le(128) for Nat, and repack the minimal two's-complement bytes into SLEB128 groups for Int.

The u64/i64 fast paths are unchanged.

Add boundary, large-value, and randomized roundtrip tests for the bignum paths, plus nat_bignum/int_bignum canbench benchmarks.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
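To illustrate why a power-of-two radix allows linear-time decoding, here is a minimal std-only sketch of the bit-packing step. It is not the PR's code: the PR delegates this step to `BigUint::from_radix_le(_, 128)`, and the function name `leb128_to_le_bytes` is hypothetical. Each 7-bit group is touched once and fed through a small fixed-width accumulator, so no work is proportional to the size of the value built so far.

```rust
/// Hypothetical helper (not from the PR): decodes an unsigned LEB128 body into
/// the value's little-endian bytes. Returns `None` if the input ends before a
/// terminating group (a byte with the high bit clear) is seen.
fn leb128_to_le_bytes(input: &[u8]) -> Option<Vec<u8>> {
    // Pass 1: strip continuation bits, collecting 7-bit groups (least significant first).
    let mut groups = Vec::new();
    let mut terminated = false;
    for &byte in input {
        groups.push(byte & 0x7f);
        if byte & 0x80 == 0 {
            terminated = true;
            break;
        }
    }
    if !terminated {
        return None;
    }

    // Pass 2: pack the groups into bytes. The accumulator never holds more
    // than 14 bits, so each step is O(1) and the whole pass is O(n).
    let mut out = Vec::with_capacity(groups.len() * 7 / 8 + 1);
    let mut acc: u32 = 0;
    let mut bits: u32 = 0;
    for group in groups {
        acc |= (group as u32) << bits;
        bits += 7;
        if bits >= 8 {
            out.push(acc as u8);
            acc >>= 8;
            bits -= 8;
        }
    }
    if bits > 0 {
        out.push(acc as u8);
    }

    // Drop high-order zero bytes, keeping at least one byte for the value 0.
    while out.len() > 1 && out.last() == Some(&0) {
        out.pop();
    }
    Some(out)
}
```

For example, the LEB128 bytes `[0xE5, 0x8E, 0x26]` (the value 624485) yield the little-endian bytes `[0x65, 0x87, 0x09]`. By contrast, the per-byte shift-and-OR approach touches the whole accumulated bignum on every input byte, which is where the quadratic cost came from.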
Bump candid and candid_derive to 0.10.31, add the CHANGELOG entry, and refresh the workspace and bench lockfiles. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Pull request overview
Improves candid’s Nat/Int LEB128/SLEB128 bignum encoding and decoding to be linear-time in the encoded length by collecting base-128 groups and constructing the bignum once, rather than repeatedly shifting/OR-ing the entire bignum per byte.
Changes:
- Reworks `Nat`/`Int` bignum encode/decode to use base-128 digit collection (`to_radix_le(128)`/`from_radix_le(_, 128)`) for O(n) behavior.
- Adds boundary, large-value, and randomized roundtrip tests for the bignum paths.
- Adds large-nat/int benches and bumps crate versions + lockfiles + changelog entry for 0.10.31.
Reviewed changes
Copilot reviewed 6 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| rust/candid/src/types/number.rs | Replaces O(n²) bignum LEB128/SLEB128 encode/decode loops with O(n) group-collection + single-pass construction. |
| rust/candid/tests/number.rs | Adds new boundary/large/randomized roundtrip tests targeting the new bignum paths. |
| rust/bench/bench.rs | Adds nat_bignum / int_bignum benchmarks using ~1 MiB encoded bodies. |
| rust/candid/Cargo.toml | Bumps candid to 0.10.31 and syncs candid_derive dependency version. |
| rust/candid_derive/Cargo.toml | Bumps candid_derive to 0.10.31. |
| CHANGELOG.md | Adds a 0.10.31 changelog entry for linear-time large Nat/Int encode/decode. |
| Cargo.lock | Updates workspace lockfile for 0.10.31 version bumps. |
| rust/bench/Cargo.lock | Updates bench lockfile for 0.10.31 version bumps. |
Use `len - 1` directly instead of `saturating_sub(1)` so the helper returns exactly `len` bytes (and panics on the never-valid `len == 0`) rather than silently returning a 1-byte body. Addresses PR review feedback. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
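The helper this commit refers to is not shown on this page. The following is a hedged sketch of what such a helper might look like, assuming it builds an unsigned LEB128 body of a requested length for the benchmarks; the name `leb128_body` and the fill bytes are assumptions, not taken from the PR.

```rust
/// Hypothetical bench helper (name and fill values assumed): returns an
/// unsigned LEB128 body that is exactly `len` bytes long.
fn leb128_body(len: usize) -> Vec<u8> {
    // `len - 1` continuation bytes (high bit set), then one terminating byte.
    // For `len == 0` the subtraction underflows, which panics when overflow
    // checks are enabled (the default in debug builds). With
    // `len.saturating_sub(1)` that case would instead quietly produce a
    // 1-byte body, which is not what the caller asked for.
    let mut body = vec![0xff; len - 1];
    body.push(0x7f);
    body
}
```

With this shape, `leb128_body(1)` is the single terminating byte and every longer body ends in exactly one byte with the continuation bit clear.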
✅ No security or compliance issues detected. Reviewed everything up to 772c058.
Summary
`Nat`/`Int` encode and decode processed values one LEB128/SLEB128 group at a time, shifting or OR-ing the whole bignum on every byte: O(n²) in the encoded length once a value exceeds the `u64`/`i64` fast path. This rebuilds the value in a single pass instead, making all four paths O(n).

- decode: collect the 7-bit groups and construct the bignum once via `BigUint::from_radix_le(_, 128)` (radix 128 is a power of two, so this bit-packs in O(n)); `Int` additionally reinterprets the two's-complement sign.
- encode: emit groups via `BigUint::to_radix_le(128)` for `Nat`, and repack the minimal two's-complement bytes into SLEB128 groups for `Int`.

The `u64`/`i64` fast paths are unchanged.

Impact
All four paths were O(n²) before this PR. Measured with `canbench` on a 1 MiB encoded value, after the fix:

[Table: `Nat` encode, `Nat` decode, `Int` encode, `Int` decode; the measured values are not preserved in this copy]

For reference, before the fix the encode side alone measured 1.93 T (`Nat`) and 3.85 T (`Int`) for the same input, a ~29,000× / ~13,500× reduction; decode had the same O(n²) shape.

Tests
- Adds cases including `0x7f` and large-negative values, plus a randomized roundtrip (1000 values, both signs), covering the bignum encode/decode paths.
- Run via `cargo test -p candid --all-features`.

Bench
- Adds `nat_bignum`/`int_bignum` to the `canbench` suite (`rust/bench`).

Release
- Bumps `candid` and `candid_derive` to 0.10.31, adds the CHANGELOG entry, and refreshes the workspace and bench lockfiles.
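The `Int` encode step described in the Summary (repacking two's-complement bytes into SLEB128 groups) can be sketched as follows. This is a std-only illustration under stated assumptions, not the code in `rust/candid/src/types/number.rs`: the function name is hypothetical, and the input is assumed to be non-empty little-endian two's-complement bytes, such as those returned by num-bigint's `BigInt::to_signed_bytes_le`.

```rust
/// Hypothetical helper (not from the PR): repacks little-endian
/// two's-complement bytes into SLEB128 in a single O(n) pass.
fn sleb128_from_twos_complement_le(bytes: &[u8]) -> Vec<u8> {
    assert!(!bytes.is_empty(), "expects at least one byte");
    let negative = bytes[bytes.len() - 1] & 0x80 != 0;
    let fill: u8 = if negative { 0xff } else { 0x00 };

    // Trim redundant sign-extension bytes, in case the input is not minimal.
    let mut n = bytes.len();
    while n > 1 && bytes[n - 1] == fill && (bytes[n - 2] & 0x80 != 0) == negative {
        n -= 1;
    }

    // Count the bits needed to hold the value plus exactly one sign bit.
    let top = bytes[n - 1];
    let redundant = if negative { top.leading_ones() } else { top.leading_zeros() };
    let significant_bits = 8 * (n - 1) + (9 - redundant as usize);
    let group_count = (significant_bits + 6) / 7; // ceiling division by 7

    // Stream 7 bits at a time through a small accumulator; once the input is
    // exhausted, refill with sign-extension bytes.
    let mut out = Vec::with_capacity(group_count);
    let mut src = bytes[..n].iter().copied();
    let mut acc: u32 = 0;
    let mut bits: u32 = 0;
    for i in 0..group_count {
        if bits < 7 {
            acc |= (src.next().unwrap_or(fill) as u32) << bits;
            bits += 8;
        }
        let mut group = (acc & 0x7f) as u8;
        acc >>= 7;
        bits -= 7;
        if i + 1 < group_count {
            group |= 0x80; // continuation bit on every group except the last
        }
        out.push(group);
    }
    out
}
```

As a check against a well-known example, -123456 has the little-endian two's-complement bytes `[0xC0, 0x1D, 0xFE]`, and the sketch produces the SLEB128 bytes `[0xC0, 0xBB, 0x78]`. Computing the group count up front avoids any per-group inspection of the remaining value, which is the kind of work that made the earlier loop quadratic.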