Summary
Replace the single usage of rust-timsort with a faster sorting algorithm: either Rust's default sorting algorithm or a crate such as glidesort, which has published favorable benchmarks against the default implementation.
Detailed Explanation
As of CPython 3.11, classic Timsort is no longer the default sorting algorithm: `list.sort()` replaced Timsort's merge policy with the one from an algorithm called Powersort.
rust-timsort
RustPython currently uses a forked project, https://github.com/RustPython/rust-timsort, as its default sorting algorithm. To quote rust-timsort's README:
"This is still an extreme work-in-progress, and performance has vast room for improvement."
This performance gap has become noticeable as unit tests introduced in later Python versions attempt to sort larger and larger lists. Sorting 1 million random numbers takes 10 to 20 minutes in RustPython with rust-timsort, versus about 0.3 seconds in CPython:
RustPython on statistics-module-kde-function [$!?] via :snake: v3.13.1 via :crab: v1.88.0
✦ ❯ time python -c "from random import random; sorted([random() for i in range(1_000_000)]); print('DONE');"
DONE
real 0m0.309s
user 0m0.274s
sys 0m0.036s
RustPython on statistics-module-kde-function [$!?] via :snake: v3.13.1 via :crab: v1.88.0
✦ ❯ time cargo run --release -- -c "from random import random; sorted([random() for i in range(1_000_000)]); print('DONE');"
Finished `release` profile [optimized] target(s) in 0.16s
Running `target/release/rustpython -c 'from random import random; sorted([random() for i in range(1_000_000)]); print('\''DONE'\'');'`
DONE
real 16m52.217s
user 16m51.926s
sys 0m0.174s
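For a rough Rust-side baseline to set against the timings above, a sketch like the following times the standard library's stable sort on the same kind of input. It is not taken from RustPython: `lcg_floats` and `time_std_sort` are hypothetical helper names, and the generator is a simple linear congruential generator chosen only to avoid external crates. Sorting bare `f64` values also skips the per-comparison overhead of Python objects, so any figure it produces is an optimistic bound rather than a prediction for RustPython.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: `n` pseudo-random floats in [0, 1) from a small LCG.
fn lcg_floats(n: usize, seed: u64) -> Vec<f64> {
    let mut state = seed;
    (0..n)
        .map(|_| {
            // Constants from Knuth's MMIX LCG; wrapping arithmetic avoids overflow panics.
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Keep the top 53 bits and scale them into [0, 1).
            (state >> 11) as f64 / (1u64 << 53) as f64
        })
        .collect()
}

/// Hypothetical helper: sort with the std stable sort and report the elapsed time.
fn time_std_sort(mut data: Vec<f64>) -> (Vec<f64>, Duration) {
    let start = Instant::now();
    // `f64::total_cmp` gives a total order, so no NaN special-casing is needed.
    data.sort_by(f64::total_cmp);
    (data, start.elapsed())
}
```

Calling `time_std_sort(lcg_floats(1_000_000, 42))` from a release build and printing the returned `Duration` gives a number to compare with the CPython run.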
glidesort
Glidesort is a crate containing a sorting algorithm of the same name, which is apparently an enhancement of Powersort. Unlike rust-timsort, Glidesort has published benchmark numbers showing it performing as well as, or (for largely presorted lists) significantly better than, Rust's built-in stable sort, which was a merge sort implementation when those benchmarks were published.

Drawbacks, Rationale, and Alternatives
Rationale
CPython and PyPy have both left Timsort behind as better-performing alternatives, namely Powersort, became available. I believe the use of Timsort in any Python implementation is an implementation detail, and if better-performing sorting algorithms are available, they should be fair game.
If we go with the default Rust sort, an entire dependency is removed from RustPython, and maintaining the sorting algorithm becomes a non-issue until performance warrants something different.
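One property any replacement would need to keep is stability, since Python's `sorted` and `list.sort` guarantee that elements comparing equal keep their original order. A minimal sketch, assuming records are compared by a key only: `slice::sort_by_key` is the standard library's stable sort, whereas `sort_unstable_by_key` makes no such guarantee. `Record` and `sort_records` are hypothetical names used for illustration and do not come from RustPython.

```rust
/// Hypothetical record type: sorted by `key`, with `tag` marking input order.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    key: u32,
    tag: &'static str,
}

/// Sort by `key` only, using the stable std sort so ties keep their input order.
fn sort_records(mut records: Vec<Record>) -> Vec<Record> {
    records.sort_by_key(|r| r.key);
    records
}
```

Swapping in `sort_unstable_by_key` would still order the keys correctly, but records with equal keys could come back in any order, which Python code relying on multi-pass sorts would notice.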
Drawbacks
- Glidesort does not seem to contain any tests and is not actively maintained, so it would probably need to be forked, as rust-timsort was.
Unresolved Questions
- Is Rust's built-in sorting algorithm fast enough that we would be happy to drop dependencies on fancier algorithms in exchange for easier maintainability?
- If not, is Glidesort's lack of tests and need for a fork a blocker, and should other crates be considered instead?