
perf/sorter: use custom binary encoding for binary-collated keys #4895

Open
jussisaurio wants to merge 1 commit into main from sorter-use-binary-encoding


@jussisaurio jussisaurio commented Jan 27, 2026

we can get 2x performance for binary-collated in-memory sorter workloads by encoding the sort keys into binary, so sorting becomes a simple lexicographic byte comparison. keys are constructed once (O(n)) and stored in a bump arena.
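To illustrate the idea, here is a minimal sketch of order-preserving key encoding. It is not the code from this PR: `encode_i64` and `sort_with_encoded_keys` are hypothetical names, only integer keys are handled, and a single contiguous `Vec<u8>` stands in for the bump arena.

```rust
use std::ops::Range;

/// Hypothetical helper: append an order-preserving encoding of `v` to `buf`.
/// Flipping the sign bit maps i64::MIN..=i64::MAX monotonically onto
/// 0..=u64::MAX, and big-endian bytes compare like the unsigned integer.
fn encode_i64(v: i64, buf: &mut Vec<u8>) {
    let biased = (v as u64) ^ (1u64 << 63);
    buf.extend_from_slice(&biased.to_be_bytes());
}

/// Hypothetical driver: build every key once, then sort by plain byte comparison.
fn sort_with_encoded_keys(values: &[i64]) -> Vec<i64> {
    // One contiguous buffer stands in for the bump arena (an assumption).
    let mut arena: Vec<u8> = Vec::with_capacity(values.len() * 8);
    let mut rows: Vec<(Range<usize>, i64)> = Vec::with_capacity(values.len());
    for &v in values {
        let start = arena.len();
        encode_i64(v, &mut arena);
        rows.push((start..arena.len(), v));
    }
    // Slices of u8 compare lexicographically, so no per-type logic is needed here.
    rows.sort_by(|a, b| arena[a.0.clone()].cmp(&arena[b.0.clone()]));
    rows.into_iter().map(|(_, v)| v).collect()
}
```

The comparator only touches bytes, so the per-comparison cost no longer depends on the value type; the type-specific work happens once per row while the key is built.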

implementation fully written by Codex 5.2. interestingly enough I was originally attempting to implement the sorting algorithm from this expired Oracle patent, but accidentally discovered during benchmarking that using binary-encoded keys for binary collation was at least twice as fast as our current ValueRef::cmp based implementation, so I decided to pivot to that.

Note: this implementation diverges from ValueRef in that ValueRef actually panics on NaN since it assumes total order - this implementation sorts `NaN` first without panicking.
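One way to get a NaN-first total order from a byte encoding is sketched below. This is an assumption about how such an encoding could look, not the PR's actual code; `encode_f64` is a hypothetical name. Every NaN is mapped to the all-zero key, which sorts before the key of any other float, including negative infinity.

```rust
/// Hypothetical helper: encode an f64 so that byte order matches numeric
/// order, with every NaN collapsed to the smallest possible key.
fn encode_f64(v: f64) -> [u8; 8] {
    if v.is_nan() {
        // All zeros is below the key of any non-NaN value.
        return [0u8; 8];
    }
    let bits = v.to_bits();
    let ordered = if bits >> 63 == 1 {
        // Negative: invert all bits so larger magnitudes sort earlier.
        !bits
    } else {
        // Non-negative: set the sign bit so these sort after all negatives.
        bits ^ (1u64 << 63)
    };
    ordered.to_be_bytes()
}
```

Note that under this sketch `-0.0` gets a smaller key than `0.0`, so a real implementation would have to decide whether the two should compare equal.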

Sorter sort-key vs std/std-cmp/10000/p0
                        time:   [668.43 µs 669.41 µs 670.44 µs]
                        thrpt:  [14.916 Melem/s 14.939 Melem/s 14.960 Melem/s]
                 change:
                        time:   [-4.6808% -3.9162% -3.2394%] (p = 0.00 < 0.05)
                        thrpt:  [+3.3478% +4.0758% +4.9107%]
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe
Sorter sort-key vs std/std-key/10000/p0
                        time:   [319.43 µs 321.37 µs 323.64 µs]
                        thrpt:  [30.899 Melem/s 31.117 Melem/s 31.306 Melem/s]
                 change:
                        time:   [-3.4728% -2.5003% -1.5040%] (p = 0.00 < 0.05)
                        thrpt:  [+1.5269% +2.5645% +3.5978%]
                        Performance has improved.


codspeed-hq bot commented Jan 27, 2026

Merging this PR will degrade performance by 32.44%

⚡ 8 improved benchmarks
❌ 1 regressed benchmark
✅ 371 untouched benchmarks
🆕 12 new benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| | Simulation | numeric_div_integers | 395 ns | 307.5 ns | +28.46% |
| 🆕 | Simulation | std-cmp[10000/k1] | N/A | 7.3 ms | N/A |
| 🆕 | Simulation | std-cmp[100000/k1] | N/A | 96 ms | N/A |
| | Simulation | numeric_from_float_value | 421.7 ns | 334.2 ns | +26.18% |
| 🆕 | Simulation | std-cmp[10000/k4] | N/A | 7.4 ms | N/A |
| 🆕 | Simulation | std-cmp[100000/k4] | N/A | 101.6 ms | N/A |
| 🆕 | Simulation | std-cmp[10000/k2] | N/A | 7.3 ms | N/A |
| 🆕 | Simulation | std-cmp[100000/k2] | N/A | 100.4 ms | N/A |
| 🆕 | Simulation | std-key[10000/k4] | N/A | 3.2 ms | N/A |
| | Simulation | numeric_neg_integer | 182.2 ns | 269.7 ns | -32.44% |
| | Simulation | numeric_from_integer_value | 420.8 ns | 304.2 ns | +38.36% |
| 🆕 | Simulation | std-key[100000/k4] | N/A | 40.9 ms | N/A |
| 🆕 | Simulation | std-key[10000/k1] | N/A | 3.5 ms | N/A |
| 🆕 | Simulation | std-key[100000/k2] | N/A | 45.1 ms | N/A |
| 🆕 | Simulation | std-key[10000/k2] | N/A | 3.4 ms | N/A |
| 🆕 | Simulation | std-key[100000/k1] | N/A | 46.4 ms | N/A |
| | Simulation | add_integers | 622.5 ns | 447.5 ns | +39.11% |
| | Simulation | multiply_integers | 651.7 ns | 505.8 ns | +28.83% |
| | Simulation | exec_if_not | 365 ns | 277.5 ns | +31.53% |
| | Simulation | divide_integers | 713.1 ns | 538.1 ns | +32.52% |
| ... | ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.


Comparing sorter-use-binary-encoding (954ddd3) with main (fb40573)


@jussisaurio jussisaurio force-pushed the sorter-use-binary-encoding branch 5 times, most recently from 94389c5 to 954ddd3 Compare January 27, 2026 14:05
@jussisaurio jussisaurio marked this pull request as ready for review January 27, 2026 14:23

@turso-bot turso-bot bot left a comment


Please review @LeMikaelF

@jussisaurio

I'll do a self-review pass of this at some point and then merge

@jussisaurio jussisaurio force-pushed the sorter-use-binary-encoding branch from 954ddd3 to a5eb97a Compare February 16, 2026 05:25