bpo-45530: speed listobject.c's unsafe_tuple_compare() #29076

tim-one · 2021-10-19T23:41:39Z

bpo-45530: speed listobject.c's unsafe_tuple_compare()

https://bugs.python.org/issue45530

sweeneyde · 2021-10-20T01:09:04Z

Is it okay that this changes observable behavior? In particular,

>>> class X:
...     def __init__(self, label):
...         self.label = label
...     def __eq__(self, other):
...         print(self.label, "==", other.label)
...         return True
...     def __lt__(self, other):
...         print(self.label, "<", other.label)
...         return True
...     def __repr__(self):
...         return self.label
... 
...     
>>> L = [X("A"), X("B"), X("C"), X("D")]
>>> sorted(L)
B < A
C < B
D < C
[D, C, B, A]

############## Before ##############
>>> sorted([(a,) for a in L])
B == A
C == B
D == C
[(A,), (B,), (C,), (D,)]

############## After ##############
>>> sorted([(a,) for a in L])
B < A
C < B
D < C
[(D,), (C,), (B,), (A,)]

If we didn't want this change, then there could be stricter checks in the pre-sort scan so that unsafe_tuple_compare only gets used if tuple_elem_compare is known to be safe.

tim-one · 2021-10-20T01:22:06Z

> Is it okay that this changes observable behavior?

I think it's fine, just not for a bugfix release. Python defines very little about its sorting algorithm, and effectively doesn't really define anything about the specific example you gave, since class X doesn't define a total ordering (doesn't, e.g., satisfy trichotomy, and for any a and b, a < b and b < a are both True). In "garbage in, garbage out" cases, we don't promise to keep the same garbage out.

If you have a class that defines a "for real" total ordering, then the result is defined, including that pairs comparing equal must retain their original order.
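To illustrate the well-defined case, here is a minimal sketch using a hypothetical `Job` class (not from the PR) whose comparisons form a consistent total ordering on `priority`. Because `<` and `==` agree with each other, sorting the bare objects and sorting them wrapped in 1-tuples should give the same order, with equal items kept in their original relative order.

```python
from functools import total_ordering


@total_ordering
class Job:
    """Hypothetical class: ordered by priority only, name is ignored."""

    def __init__(self, priority, name):
        self.priority = priority
        self.name = name

    def __eq__(self, other):
        return self.priority == other.priority

    def __lt__(self, other):
        return self.priority < other.priority

    def __repr__(self):
        return f"{self.name}:{self.priority}"


jobs = [Job(2, "a"), Job(1, "b"), Job(2, "c"), Job(1, "d")]

# Sort the objects directly, then sort them wrapped in 1-tuples.
plain = [j.name for j in sorted(jobs)]
wrapped = [t[0].name for t in sorted([(j,) for j in jobs])]

print(plain)    # equal priorities keep their input order
print(wrapped)  # same order, whichever comparison strategy the sort uses
```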

rhettinger · 2021-10-20T03:10:27Z

When the first elements are equal, which is faster, the two calls to tuple_elem_compare() or the one call to Py_RichCompareBool(Py_EQ)?

tim-one · 2021-10-20T03:38:06Z

> When the first elements are equal, which is faster, the two calls to tuple_elem_compare() or the one call to Py_RichCompareBool(Py_EQ)?

Can't answer without knowing the specific function tuple_elem_compare resolves to. For very simple types (like floats, ints that fit in one internal CPython "digit", strings represented with 1-byte characters, ... at least) it resolves to special functions defined in listobject.c, which are leaner and faster than the base types' __lt__ implementations (note that the functions in listobject.c have no logic at all to compute anything other than <, require no type checks, and don't cater to the possibility of needing conversions - the pre-scan of the list that set this all up ensured type homogeneity).

The two calls may be faster then. In general, though, I expect the two calls would be slower.
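As a rough mental model of the pre-scan described above, the following Python sketch picks a comparison by name from the types of the keys. It is a simplification, not the C code: `pick_compare` is a hypothetical helper, the 30-bit bound for a single "digit" is an assumption about the build, and the real scan also handles keys that are tuples.

```python
def pick_compare(keys):
    """Simplified model: name the compare function a pre-scan might select."""
    first_type = type(keys[0])
    if any(type(k) is not first_type for k in keys):
        return "safe_object_compare"        # mixed types: generic path
    if first_type is float:
        return "unsafe_float_compare"
    # Assumption: one internal digit holds 30 bits on this build.
    if first_type is int and all(abs(k) < 2**30 for k in keys):
        return "unsafe_long_compare"
    # 1-byte-per-character strings: every code point is below 256.
    if first_type is str and all(all(ord(c) < 256 for c in k) for k in keys):
        return "unsafe_latin_compare"
    return "unsafe_object_compare"          # homogeneous, but no special case


print(pick_compare([1.5, 2.5]))
print(pick_compare([3, "x"]))
```

The point of the model is only that the choice is made once, before sorting starts, so the selected function can skip type checks on every comparison.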

pochmann · 2021-10-20T04:04:35Z

@tim-one Even for latin strings, I think if they're long enough, two unsafe_latin_compare calls could take twice as long as one __eq__, right?

Related: The comment above unsafe_tuple_compare says "The idea is that most tuple compares don't involve x[1:]". At first that seemed right to me, and it would mean that "half of most of the time", you'd only need one tuple_elem_compare call, not two. So on average it would take 1.5 tuple_elem_compare. But after the analysis in my stackoverflow answer I'm not so sure anymore. In the smallest case, 11.01 out of 11.99 tuple comparisons were decided at the first element. But in the largest case, only 12.06 out of 21.26 were. I think partly because there were more duplicates at the first element, but also partly because the second element frequently differed, causing further tuple comparisons, which involves comparing equal first elements again.

Objects/listobject.c

pochmann · 2021-10-20T04:22:09Z

Summary of how I see it:

Supposedly common case, where the first element differs:

Current way: 1 slow == and 1 fast <.
Proposed way: 1 or 2 fast <.
=> Winner: the proposed way

Supposedly rare case, where the first element is equal.

Current way: 1 slow ==.
Proposed way: 2 fast <.
=> Winner: depends on type/values.

=> Winner overall: Also depends on how common/rare equality at the first element really is.
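The tallies in this summary can be reproduced with a small Python model of the two strategies. This is a sketch of the idea as summarized here, not the C implementation; `compare_current` and `compare_proposed` are hypothetical names, and the model assumes non-empty tuples.

```python
from collections import Counter


def compare_current(v, w, counts):
    """'Current way': scan with == for the first differing slot, then one <."""
    for a, b in zip(v, w):
        counts["eq"] += 1
        if not a == b:
            counts["lt"] += 1
            return a < b
    return len(v) < len(w)


def compare_proposed(v, w, counts):
    """'Proposed way': try < in both directions on the first slot before any ==."""
    counts["lt"] += 1
    if v[0] < w[0]:
        return True
    counts["lt"] += 1
    if w[0] < v[0]:
        return False
    # First elements are equal: fall back to the == scan for the rest.
    for a, b in zip(v[1:], w[1:]):
        counts["eq"] += 1
        if not a == b:
            counts["lt"] += 1
            return a < b
    return len(v) < len(w)


for v, w in [((1,), (2,)), ((2,), (1,)), ((1,), (1,))]:
    cur, new = Counter(), Counter()
    compare_current(v, w, cur)
    compare_proposed(v, w, new)
    print(v, w, "current:", dict(cur), "proposed:", dict(new))
```

Both functions return the same answer for a consistent ordering; only the mix of `==` and `<` calls differs.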

tim-one · 2021-10-20T04:38:58Z

@pochmann, yes, if two latin strings are equal, memcmp will have to look at every pair of characters regardless of which comparison outcome is asked for.

The patch here appears to be pretty much a wash for the StackOverflow program. I don't care - I think his keys were obviously and highly contrived, and so was his raw data. It wasn't "a real program" in any respect. But it was a program anyone could run as-is, which is the primary thing on SO.

You noted that there were a lot of duplicates among the ''.join(sorted(x)) keys, but there are far more duplicates among the x[::2] keys: the latter effectively builds a 3-digit decimal integer out of a 6-digit decimal integer, and so there are only about a thousand possible distinct results.
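A quick check of that count, assuming the inputs are 6-digit decimal strings without leading zeros (the exact data set of the StackOverflow program is not reproduced here):

```python
# Assumption: x ranges over every 6-digit decimal string, "100000".."999999".
values = [str(n) for n in range(100_000, 1_000_000)]

# x[::2] keeps the characters at positions 0, 2 and 4.
example = "123456"[::2]
secondary = {x[::2] for x in values}
distinct = len(secondary)

print(example)   # the three kept digits
print(distinct)  # number of distinct secondary keys
```

Under this assumption the first kept digit is never zero, so the count comes out slightly below 1000.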

pochmann · 2021-10-20T06:45:33Z

@tim-one Yes, the stackoverflow question uses contrived data, I just mean it made me aware that differences at the second element can cause a lot of additional first-element comparisons. Is it unrealistic? If you for example sort events by day and then by time, you likely do have days duplicated a lot and not much duplication among times within each day.

I noticed the higher duplication among x[::2] later but forgot to update. Done now. It's less relevant, though. Duplicates of the primary keys are more relevant, as they allow the secondary keys to play a role, which can then cause the additional fruitless comparisons of equal primary keys. I tried the experiment again with different secondary keys. With less secondary-key duplication, the == comparisons for the primary went a bit further up.

Comparisons per element, with the original x[::2] secondary key:
21.26 == ''.join(sorted(x))    (the primary key)
12.06 < ''.join(sorted(x))
 9.20 == x[::2]                (the secondary key)
 6.68 < x[::2]

With whole x as secondary key, i.e., only little duplication
(but still correlated with the primary key):
21.96 == ''.join(sorted(x))
12.03 < ''.join(sorted(x))
 9.92 == x
 8.12 < x

With random() as secondary key, i.e., likely no duplication
(and no relationship with the primary key):
21.96 == ''.join(sorted(x))
12.03 < ''.join(sorted(x))
 9.93 == random()
 9.93 < random()

With the string "x" as secondary key, i.e., complete duplication:
16.25 == ''.join(sorted(x))
13.43 < ''.join(sorted(x))
 2.82 == "x"

strategy. This looks to be quite successful. It loses a few per cent in speed in cases that always want to use the cheaper tests, but can gain far more (compared to this branch's state before this commit) in some cases where PyObject_RichCompareBool(..., Py_EQ) typically returns 1 (they're equal) when applied to the first pair.

First stab. About 40% speedup on tupsort.py's "(float,)" case.

03b88d6

bedevere-bot added the awaiting core review label Oct 19, 2021

the-knights-who-say-ni added the CLA signed label Oct 19, 2021

tim-one self-assigned this Oct 19, 2021

tim-one added skip news skip issue and removed skip issue labels Oct 19, 2021

tim-one changed the title ~~First stab. About 40% speedup on tupsort.py's "(float,)" case.~~ bpo-45530: First stab. About 40% speedup on tupsort.py's "(float,)" case. Oct 20, 2021

tim-one changed the title ~~bpo-45530: First stab. About 40% speedup on tupsort.py's "(float,)" case.~~ bpo-45530: speed listobject.c's unsafe_tuple_compare() Oct 20, 2021

tim-one removed the skip news label Oct 20, 2021

📜🤖 Added by blurb_it.

01bbf65

rhettinger reviewed Oct 20, 2021

rhettinger approved these changes Oct 20, 2021

bedevere-bot added awaiting merge and removed awaiting core review labels Oct 20, 2021

ambv and others added 7 commits October 20, 2021 20:35

Add reordering notification to whatsnew, clarify Blurb a little

4551c49

Fix invalid ref in Blurb

67c4cc9

Keep track of whether unsafe_tuple_compare() calls are

5b28703

resolved by the very first tuple elements, and adjust strategy accordingly.

Add a clarifying assert.

ff23c7b

Swap the order of if/else blocks to put the more likely block first.

Remove the new firsti vrbl - needless name proliferation.

cd69e8b

Errors from comparisons are rare, so rearrange code to act on normal …

7c158f6

…outcomes first.

tim-one merged commit 51ed2c5 into python:main Oct 25, 2021

bedevere-bot removed the awaiting merge label Oct 25, 2021

tim-one deleted the tsort branch October 25, 2021 03:27

jacobtylerwalls mentioned this pull request Jul 23, 2022

Sorting tuples containing None raises TypeError in 3.11 #95173

Closed

pablogsal mentioned this pull request Jul 23, 2022

gh-95173: Revert commit 51ed2c56a1852cd6b09c85ba81312dc9782772ce #95176

Merged

pablogsal added a commit to pablogsal/cpython that referenced this pull request Jul 23, 2022

Revert "bpo-45530: speed listobject.c's unsafe_tuple_compare() (pytho…

5a3ebf3

…nGH-29076)" This reverts commit 51ed2c5.

jacobtylerwalls mentioned this pull request Aug 1, 2022

gh-95173: Add a regression test for sorting tuples containing None #95464

Merged
