gh-107868 Add an `O(1)` fastpath for `sum(range(...))` #107870

mcognetta · 2023-08-11T15:52:11Z

This adds a fastpath for sum(range(...)), which takes O(1) time. This partially resolves #68264 (but does not address the core issue of why there was a slowdown overall) as well as #107868.

Note: I am still not very familiar with CPython's internals. I believe this is a reasonable place to implement it, but I think the best way would be to add a sum slot to PySequenceMethods, except that that would break ABI backwards compatibility, right?

The other thing I am worried about is that I did not implement the chain of operators (and their reference decrementing) in the most idiomatic way. Any advice would be appreciated.

Some example timings are below (I do not have python2 on my machine to compare to, but #68264 shows it would be closer to the main branch timings than this PR):

Main:

>>> import time
>>> t=time.time();sum(range(1,pow(10,8)+1));print(time.time()-t)
5000000050000000
3.165787696838379

This PR:

>>> import time
>>> t=time.time();sum(range(1,pow(10,8)+1));print(time.time()-t)
5000000050000000
7.939338684082031e-05
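For reference, the closed form behind the fastpath can be sketched in pure Python. This is only an illustration of the arithmetic that the C code in this PR appears to perform (length * start + step * length * (length - 1) // 2); the function name `range_sum_closed_form` is hypothetical and not part of the patch.

```python
def range_sum_closed_form(r: range) -> int:
    """Sum an arithmetic progression without iterating over it.

    Hypothetical helper: mirrors the formula used by the fastpath,
    n * start + step * (n * (n - 1) // 2), where n is the range length.
    """
    n = len(r)  # assumes the length fits in a C ssize_t
    # n * (n - 1) is always even, so the floor division is exact.
    return n * r.start + r.step * (n * (n - 1) // 2)


print(range_sum_closed_form(range(1, 10**8 + 1)))  # 5000000050000000
```

The result matches the value printed in the timings above, and the cost no longer depends on the number of elements.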

Issue: Make an O(1) fastpath for sum(range(...)) #107868

bedevere-bot · 2023-08-11T15:52:17Z

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

Eclips4 · 2023-08-11T16:09:15Z

Lib/test/test_builtin.py

@@ -1626,8 +1626,12 @@ def test_sum(self):

        self.assertEqual(sum(range(10), 1000), 1045)
        self.assertEqual(sum(range(10), start=1000), 1045)
+        self.assertEqual(sum(range(10), 0.1), 45.1)


I don't think adding new test cases for sum belongs in this PR.
If you want to improve the sum tests, you can open a separate issue/PR.

However, as a rule, float objects should not be compared with assertEqual. You should use assertAlmostEqual instead.

Fixed. I will move them to another PR if that would be better.
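As a rough illustration of the point about float comparisons (the class and method names here are made up for the example and are not the ones in Lib/test/test_builtin.py):

```python
import unittest


class SumFloatTest(unittest.TestCase):
    """Hypothetical test case showing the assertAlmostEqual pattern."""

    def test_sum_with_float_start(self):
        # Binary floating point cannot represent 0.1 exactly, so an exact
        # comparison may depend on rounding in the summation order.
        self.assertAlmostEqual(sum(range(10), 0.1), 45.1)

    def test_exact_comparison_is_fragile(self):
        # Classic example of why assertEqual is risky for floats.
        self.assertNotEqual(0.1 + 0.2, 0.3)
        self.assertAlmostEqual(0.1 + 0.2, 0.3)
```

Saved to a file, this can be run with `python -m unittest <file>`.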

Eclips4 · 2023-08-11T16:12:08Z

Python/bltinmodule.c

+    PyObject* start = PyObject_GetAttrString(range, "start");
+    PyObject* step = PyObject_GetAttrString(range, "step");
+
+    PyObject* one = PyLong_FromLong(1);
+    PyObject* a = PyNumber_Subtract(length, one);
+
+    PyObject* b = PyNumber_Multiply(a, length);
+
+    PyObject* two = PyLong_FromLong(2);
+    PyObject* c = PyNumber_FloorDivide(b, two);
+
+    PyObject* d = PyNumber_Multiply(step, c);
+
+    PyObject* e = PyNumber_Multiply(length, start);
+
+    PyObject* result = PyNumber_Add(d, e);


All of these calls can return NULL. You should check for that and handle the error in each case.

To be clear, do you mean adding something like:

if (one == NULL) { Py_DECREF(one); return NULL; }

after every one of these? I would have to do it after

PyObject* rangesum = range_sum_fastpath(module, iterable);
result = PyNumber_Add(result, rangesum);
Py_DECREF(rangesum);
return result;

as well then, I think.

Eclips4 · 2023-08-11T16:13:27Z

Python/bltinmodule.c

+    Py_DecRef(length);
+    Py_DecRef(start);
+    Py_DecRef(step);
+
+    Py_DecRef(one);
+    Py_DecRef(a);
+    Py_DecRef(b);
+    Py_DecRef(two);
+    Py_DecRef(c);
+    Py_DecRef(d);
+    Py_DecRef(e);


This seems incorrect. You should use the Py_DECREF(...) macro.
I mean, you should replace every call to Py_DecRef with Py_DECREF (throughout the code you have written).

Eclips4 · 2023-08-11T16:22:56Z

Python/bltinmodule.c

 static PyObject *
 builtin_sum_impl(PyObject *module, PyObject *iterable, PyObject *start)
 /*[clinic end generated code: output=df758cec7d1d302f input=162b50765250d222]*/
 {
    PyObject *result = start;
    PyObject *temp, *item, *iter;

+    if (PyRange_Check(iterable)) {


builtin_sum_impl is written using Argument Clinic, so you should run AC on this file to regenerate the checksums (you can see them at the beginning of this function).
You can read more about Argument Clinic here:
https://docs.python.org/3.13/howto/clinic.html

Eclips4 · 2023-08-11T16:38:01Z

Also, I would prefer to have a NEWS entry for this PR :)

mcognetta · 2023-08-11T16:56:46Z

Thanks for your reviews. I will fix the clinic stuff tomorrow.

There is one other issue that I am concerned about. When running something like:

>>> r = range(2**1000, 2**1000 + 1000, 999)
>>> sum(r)
21430172143725346418968500981200036211228096234110672148875007767407021022498722449863967576313917162551893458351062936503742905713846280871969155149397149607869135549648461970842149210124742283755908364306092949967163882534797535118331087892154125829142392955373084335320859663305248773674411336139751
>>> r = range(2**64)
>>> sum(r)
Segmentation fault (core dumped)

There is a segfault, even though it should be able to represent all of these numbers.

Actually, in a prior version of this implementation, doing sum(range(2**64)) wouldn't segfault, but it would report an error saying that the value could not be represented in a ssize_t (I don't have the exact error, unfortunately).

I am not sure what the result would be if you ran that same code on main, as it would take too long to finish. But it concerns me that this is clearly able to generate numbers larger than the system max, but in some cases it fails. What do you think?
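One way to probe this from Python, as a sketch only: on CPython, `len()` of a range whose length exceeds `sys.maxsize` raises OverflowError, which may be related to the ssize_t error mentioned above (that link is an assumption, since the earlier code is not shown, and it does not explain the segfault). The helpers `range_length` and `big_range_sum` are hypothetical; they compute the length from the arbitrary-precision `start`, `stop`, and `step` attributes instead of calling `len()`.

```python
def range_length(r: range) -> int:
    """Length of a range computed with Python ints, without calling len()."""
    start, stop, step = r.start, r.stop, r.step
    if step > 0:
        return max(0, (stop - start + step - 1) // step)
    return max(0, (start - stop - step - 1) // (-step))


def big_range_sum(r: range) -> int:
    """Closed-form sum that also works when len(r) would overflow."""
    n = range_length(r)
    return n * r.start + r.step * (n * (n - 1) // 2)


r = range(2**64)
try:
    len(r)
except OverflowError as exc:
    # On CPython the length does not fit in a C ssize_t.
    print("len() failed:", exc)

print(big_range_sum(r) == 2**63 * (2**64 - 1))  # True
```

Since the expected value for range(2**64) is 2**63 * (2**64 - 1), a check like this could be used against a build with the fastpath without waiting for the loop on main.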

Misc/NEWS.d/next/Core and Builtins/2023-08-11-16-44-29.gh-issue-107868.BgT7zE.rst

mcognetta · 2023-08-13T12:05:50Z

Closing due to the consensus that the added complexity is not worth it since this use case is so rare.

add fastpath for sum(range(...))

10ffc47

bedevere-bot mentioned this pull request Aug 11, 2023

Make an O(1) fastpath for sum(range(...)) #107868

Closed

bedevere-bot added the awaiting review label Aug 11, 2023

Eclips4 reviewed Aug 11, 2023

Eclips4 added the performance Performance or resource usage label Aug 11, 2023

Eclips4 reviewed Aug 11, 2023

fix assert

bd2c5f9

fix Py_DECREF

2a0b4a3

📜🤖 Added by blurb_it.

86bdcbf

Eclips4 reviewed Aug 11, 2023

fix news and acks

1bf1688

mcognetta closed this Aug 13, 2023
