Commit 65b8b84

roundupsize() and friends: fiddle over-allocation strategy for list resizing.

Accurate timings are impossible on my Win98SE box, but this is obviously faster even on this box for reasonable list.append() cases. I give credit for this not to the resizing strategy but to getting rid of integer multiplication and division (in favor of shifting) when computing the rounded-up size.

For unreasonable list.append() cases, Win98SE now displays linear behavior for one-at-a-time appends up to a list with about 35 million elements. Then it dies with a MemoryError, due to fatally fragmented *address space* (there's plenty of VM available, but by this point Win9x has broken user space into many distinct heaps, none of which has enough contiguous space left to resize the list, and for whatever reason Win9x isn't coalescing the dead heaps). Before the patch it got a MemoryError for the same reason, but once the list reached about 2 million elements.

Haven't yet tried on Win2K but have high hopes extreme list.append() will be much better behaved now (NT & Win2K didn't fragment address space, but suffered obvious quadratic-time behavior before as lists got large).

For other systems I'm relying on common sense: replacing integer * and / by << and >> can't plausibly hurt, the number of function calls hasn't changed, and the total operation count for reasonably small lists is about the same (while the operations are cheaper now).
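
The multiply/divide-to-shift replacement mentioned above can be sketched in isolation. This is an illustration only, not code from the patch: the helper names `next_multiple_muldiv`, `next_multiple_shift`, and `check_equivalence` are hypothetical, and the sketch assumes the block size is a power of two, 2**k, which is the case the shifting form relies on.

```c
#include <assert.h>

/* Hypothetical helper: next multiple of 2**k strictly above n,
 * computed with integer division and multiplication. */
unsigned int
next_multiple_muldiv(unsigned int n, unsigned int k)
{
    unsigned int block = 1u << k;
    return (n / block + 1u) * block;
}

/* Hypothetical helper: the same value computed with shifts only,
 * mirroring the ((n >> nbits) + 1) << nbits form used in the patch. */
unsigned int
next_multiple_shift(unsigned int n, unsigned int k)
{
    return ((n >> k) + 1u) << k;
}

/* Check that both forms agree for a range of small inputs and for the
 * first few block sizes named in the patch comment (8, 64, 512).
 * Returns 1 if every comparison passed. */
int
check_equivalence(void)
{
    unsigned int n, k;
    for (k = 3; k <= 9; k += 3) {
        for (n = 0; n < 100000u; n++) {
            assert(next_multiple_muldiv(n, k) == next_multiple_shift(n, k));
        }
    }
    return 1;
}
```

Both forms return a value strictly greater than `n`, so an `n` that is already a multiple of the block size moves up to the next multiple.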
1 parent 56a71ee commit 65b8b84

1 file changed

Lines changed: 31 additions & 8 deletions

Objects/listobject.c

@@ -9,17 +9,40 @@
 #include <sys/types.h> /* For size_t */
 #endif
 
-#define ROUNDUP(n, PyTryBlock) \
-	((((n)+(PyTryBlock)-1)/(PyTryBlock))*(PyTryBlock))
-
 static int
 roundupsize(int n)
 {
-	if (n < 500)
-		return ROUNDUP(n, 10);
-	else
-		return ROUNDUP(n, 100);
-}
+	unsigned int nbits = 0;
+	unsigned int n2 = (unsigned int)n >> 5;
+
+	/* Round up:
+	 * If n < 256, to a multiple of 8.
+	 * If n < 2048, to a multiple of 64.
+	 * If n < 16384, to a multiple of 512.
+	 * If n < 131072, to a multiple of 4096.
+	 * If n < 1048576, to a multiple of 32768.
+	 * If n < 8388608, to a multiple of 262144.
+	 * If n < 67108864, to a multiple of 2097152.
+	 * If n < 536870912, to a multiple of 16777216.
+	 * ...
+	 * If n < 2**(5+3*i), to a multiple of 2**(3*i).
+	 *
+	 * This over-allocates proportional to the list size, making room
+	 * for additional growth. The over-allocation is mild, but is
+	 * enough to give linear-time amortized behavior over a long
+	 * sequence of appends() in the presence of a poorly-performing
+	 * system realloc() (which is a reality, e.g., across all flavors
+	 * of Windows, with Win9x behavior being particularly bad -- and
+	 * we've still got address space fragmentation problems on Win9x
+	 * even with this scheme, although it requires much longer lists to
+	 * provoke them than it used to).
+	 */
+	do {
+		n2 >>= 3;
+		nbits += 3;
+	} while (n2);
+	return ((n >> nbits) + 1) << nbits;
+}
 
 #define NRESIZE(var, type, nitems) PyMem_RESIZE(var, type, roundupsize(nitems))
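
To make the effect of the hunk above concrete, here is a minimal standalone sketch that copies the old and new rounding rules out of the diff and counts how often the requested allocation size changes over a run of one-at-a-time appends. The names `roundupsize_old`, `roundupsize_new`, `count_resizes`, and `selfcheck` are hypothetical and do not appear in listobject.c. The counting loop is only a model of NRESIZE: it assumes a real reallocation happens exactly when the rounded-up size changes, and it measures nothing about any particular system realloc().

```c
#include <assert.h>

/* Old rule from the removed lines: multiples of 10 below 500,
 * multiples of 100 from 500 on. */
int
roundupsize_old(int n)
{
    int block = (n < 500) ? 10 : 100;
    return ((n + block - 1) / block) * block;
}

/* New rule from the added lines: the block size 2**nbits grows by a
 * factor of 8 each time n crosses 2**(5+3*i). */
int
roundupsize_new(int n)
{
    unsigned int nbits = 0;
    unsigned int n2 = (unsigned int)n >> 5;

    do {
        n2 >>= 3;
        nbits += 3;
    } while (n2);
    return ((n >> nbits) + 1) << nbits;
}

/* Model of repeated appends: grow from 1 to n_appends items and count
 * how many times the rounded-up size differs from the previous one.
 * Assumption: each such change costs one real reallocation. */
long
count_resizes(int (*roundup)(int), int n_appends)
{
    long resizes = 0;
    int capacity = 0;
    int n;

    for (n = 1; n <= n_appends; n++) {
        int wanted = roundup(n);
        if (wanted != capacity) {
            capacity = wanted;
            resizes++;
        }
    }
    return resizes;
}

/* Spot checks against the table in the patch comment. */
void
selfcheck(void)
{
    assert(roundupsize_new(255) == 256);    /* n < 256: multiple of 8 */
    assert(roundupsize_new(1000) == 1024);  /* n < 2048: multiple of 64 */
    assert(roundupsize_new(2048) == 2560);  /* n < 16384: multiple of 512 */
    assert(roundupsize_old(501) == 600);
}
```

Under this model, calling `count_resizes(roundupsize_old, 1000000)` and `count_resizes(roundupsize_new, 1000000)` shows the old rule changing size far more often, because its block size stops growing at 100 while the new rule's block size keeps pace with the list length.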
