MAINT,DOC: Clean up consolidated sorting code #31428
(lint failure is unrelated, put a PR into main to fix it)
MaanasArora
left a comment
I looked briefly and I think this looks good! For now I only have one place where I wonder whether a different angle would be better; see the inline comment.
And yes, StringDType should have used timsort - I probably introduced that oversight in a PR, sorry!
Please use the `stable` parameter instead. 'quicksort' and 'heapsort'
are mapped to the default ``stable=False``, while 'mergesort' and
'stable' are mapped to ``stable=True``. Fine grained algorithm control
has been removed.
The mapping idea makes sense under the hood but I wonder if it's better to be less explicit about it? Maybe something like (just a rough suggestion):
"Please use the stable parameter instead, as NumPy will choose a suitable algorithm for the given array. This argument is retained for backward compatibility. quicksort or heapsort are equivalent to stable=False, and mergesort or stable are equivalent to stable=True."
Edit: or maybe, "Please use the stable parameter instead. This argument is retained for backward compatibility and provides no additional control. quicksort or heapsort are equivalent to stable=False, and mergesort or stable are equivalent to stable=True."
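To make the equivalence in the suggested wording concrete, here is a minimal sketch of the mapping. The helper `kind_to_stable` and the table `_KIND_TO_STABLE` are hypothetical names used only for illustration; they are not NumPy functions, and NumPy's actual handling lives in its C/C++ code.

```python
# Hypothetical illustration only: not part of NumPy's API.
# Maps each legacy `kind` string to the `stable` value it is equivalent to.
_KIND_TO_STABLE = {
    "quicksort": False,
    "heapsort": False,
    "mergesort": True,
    "stable": True,
}


def kind_to_stable(kind):
    """Return the `stable` flag that a legacy `kind` string corresponds to."""
    try:
        return _KIND_TO_STABLE[kind]
    except KeyError:
        raise ValueError(f"unknown sort kind: {kind!r}") from None


if __name__ == "__main__":
    for kind in ("quicksort", "heapsort", "mergesort", "stable"):
        print(f"kind={kind!r} is equivalent to stable={kind_to_stable(kind)}")
```

The point of the sketch is that the four strings collapse to a single boolean, which is why the docs can say the argument provides no additional control.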
Adopted (with a "while"), nice suggestion, thanks!
A while ago, the sort code was consolidated to never really use mergesort (StringDType did, but I think this was just an oversight) or heapsort (except as part of the quicksort/introsort).
This does a few cleanups:
* Delete mergesort.cpp (remaining users use timsort now, a fix).
* Make the C++-only header `.hpp` and remove the unnecessary `.h.src`.
* Use the helper to fetch `cmp` more generally in the sort code.
* Move some comments from the cpp to the hpp files, as they fit better there I think. But I didn't try to polish them much.

These are all rather straightforward. But the doc fixups do deserve a bit of a closer look!
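For readers unfamiliar with why heapsort survives only "as part of the quicksort/introsort": introsort runs quicksort but switches to heapsort once the recursion depth exceeds a limit, which bounds the worst case at O(n log n). The following is a generic pure-Python sketch of that idea, assuming a depth limit of 2·log2(n) and a simple Lomuto partition; it is not NumPy's C++ implementation, and all function names are made up for illustration.

```python
import heapq
import math


def _heapsort(a, lo, hi):
    """Sort a[lo:hi] in place using a binary heap (via heapq)."""
    heap = a[lo:hi]
    heapq.heapify(heap)
    for i in range(lo, hi):
        a[i] = heapq.heappop(heap)


def _introsort(a, lo, hi, depth):
    """Quicksort a[lo:hi], falling back to heapsort when depth runs out."""
    while hi - lo > 1:
        if depth == 0:
            # Quicksort is degenerating on this range: finish it with
            # heapsort to keep the overall O(n log n) bound.
            _heapsort(a, lo, hi)
            return
        depth -= 1
        # Lomuto partition around the middle element (moved to the end).
        mid = (lo + hi) // 2
        a[mid], a[hi - 1] = a[hi - 1], a[mid]
        pivot = a[hi - 1]
        store = lo
        for i in range(lo, hi - 1):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi - 1] = a[hi - 1], a[store]
        # Recurse into the left part, keep looping on the right part.
        _introsort(a, lo, store, depth)
        lo = store + 1


def introsort(values):
    """Return a sorted copy of `values` (not a stable sort)."""
    a = list(values)
    if len(a) > 1:
        depth = 2 * int(math.log2(len(a)))
        _introsort(a, 0, len(a), depth)
    return a


if __name__ == "__main__":
    print(introsort([5, 2, 9, 1, 5, 6]))
```

Neither quicksort nor heapsort preserves the order of equal elements, which is consistent with both names being treated as `stable=False`.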
Ping, there is almost nothing to this except a lot of cleanup and a bit of docs...
Thanks Sebastian. Let's get this in.
(I did use an agent a fair bit to help with some things.)
CC @charris maybe you can have a look, I changed the table and surrounding docs a fair bit and deleted the "speed" ranking (honestly, because I wasn't sure what to put for radixsort anyway)?
The code changes should be straightforward (but code comments are moved). But I tried to clean up the Python sort algorithm documentation a bit, and I am not sure this is the best angle.
Mainly, I didn't think it is useful to still list "mergesort", etc. when we never use it...