Commit 657fe38

Large code rearrangement to use better algorithms, in the sense of needing
substantially fewer array-element compares. This is best practice as of Knuth Volume 3 Ed 2, and the code is actually simpler this way (although the key idea may be counter-intuitive at first glance: breaking out of a loop early loses when it costs more to try to get out early than getting out early saves). Also added a comment block explaining the difference and giving some real counts; demonstrating that heapify() is more efficient than repeated heappush(); and emphasizing the obvious point that list.sort() is more efficient if what you really want to do is sort.
1 parent 6bdbc9e commit 657fe38
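
The two claims in the commit message (heapify() needs fewer compares than repeated heappush(), and list.sort() needs fewer compares than heapify() followed by exhaustive heappop()) can be re-checked with a small counting harness. The sketch below is not part of this commit. It assumes Python 3 and the current standard-library heapq, whose internals differ from the code in this diff, so exact counts will not match the figures recorded in the commit. Counted, count_compares, and the helper functions are hypothetical names introduced only for illustration.

```python
import heapq
import random


class Counted:
    """Hypothetical wrapper that counts every `<` comparison between instances."""
    compares = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.compares += 1
        return self.value < other.value


def count_compares(operation, values):
    """Run operation on freshly wrapped values; return how many compares it made."""
    items = [Counted(v) for v in values]
    Counted.compares = 0
    operation(items)
    return Counted.compares


def build_with_heapify(items):
    heapq.heapify(items)


def build_with_pushes(items):
    heap = []
    for item in items:
        heapq.heappush(heap, item)


def heapify_then_pop_all(items):
    heapq.heapify(items)
    while items:
        heapq.heappop(items)


def sort_in_place(items):
    items.sort()


if __name__ == "__main__":
    rng = random.Random(0)
    values = [rng.random() for _ in range(1000)]
    for name, operation in [
        ("heapify", build_with_heapify),
        ("1000 heappush calls", build_with_pushes),
        ("heapify + 1000 heappops", heapify_then_pop_all),
        ("list.sort", sort_in_place),
    ]:
        print(name, count_compares(operation, values))
```

Counting inside __lt__ works here because both heapq and list.sort() order elements using only the < operator in Python 3.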

1 file changed

Lines changed: 79 additions & 39 deletions

Lib/heapq.py

@@ -126,51 +126,16 @@
 
 def heappush(heap, item):
     """Push item onto heap, maintaining the heap invariant."""
-    pos = len(heap)
-    heap.append(None)
-    while pos:
-        parentpos = (pos - 1) >> 1
-        parent = heap[parentpos]
-        if item >= parent:
-            break
-        heap[pos] = parent
-        pos = parentpos
-    heap[pos] = item
-
-# The child indices of heap index pos are already heaps, and we want to make
-# a heap at index pos too.
-def _siftdown(heap, pos):
-    endpos = len(heap)
-    assert pos < endpos
-    item = heap[pos]
-    # Sift item into position, down from pos, moving the smaller
-    # child up, until finding pos such that item <= pos's children.
-    childpos = 2*pos + 1 # leftmost child position
-    while childpos < endpos:
-        # Set childpos and child to reflect smaller child.
-        child = heap[childpos]
-        rightpos = childpos + 1
-        if rightpos < endpos:
-            rightchild = heap[rightpos]
-            if rightchild < child:
-                childpos = rightpos
-                child = rightchild
-        # If item is no larger than smaller child, we're done, else
-        # move the smaller child up.
-        if item <= child:
-            break
-        heap[pos] = child
-        pos = childpos
-        childpos = 2*pos + 1
-    heap[pos] = item
+    heap.append(item)
+    _siftdown(heap, 0, len(heap)-1)
 
 def heappop(heap):
     """Pop the smallest item off the heap, maintaining the heap invariant."""
     lastelt = heap.pop() # raises appropriate IndexError if heap is empty
     if heap:
         returnitem = heap[0]
         heap[0] = lastelt
-        _siftdown(heap, 0)
+        _siftup(heap, 0)
     else:
         returnitem = lastelt
     return returnitem
@@ -184,7 +149,82 @@ def heapify(x):
     # j-1 is the largest, which is n//2 - 1. If n is odd = 2*j+1, this is
     # (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1.
     for i in xrange(n//2 - 1, -1, -1):
-        _siftdown(x, i)
+        _siftup(x, i)
+
+# 'heap' is a heap at all indices >= startpos, except possibly for pos. pos
+# is the index of a leaf with a possibly out-of-order value. Restore the
+# heap invariant.
+def _siftdown(heap, startpos, pos):
+    newitem = heap[pos]
+    # Follow the path to the root, moving parents down until finding a place
+    # newitem fits.
+    while pos > startpos:
+        parentpos = (pos - 1) >> 1
+        parent = heap[parentpos]
+        if parent <= newitem:
+            break
+        heap[pos] = parent
+        pos = parentpos
+    heap[pos] = newitem
+
+# The child indices of heap index pos are already heaps, and we want to make
+# a heap at index pos too. We do this by bubbling the smaller child of
+# pos up (and so on with that child's children, etc) until hitting a leaf,
+# then using _siftdown to move the oddball originally at index pos into place.
+#
+# We *could* break out of the loop as soon as we find a pos where newitem <=
+# both its children, but turns out that's not a good idea, and despite that
+# many books write the algorithm that way. During a heap pop, the last array
+# element is sifted in, and that tends to be large, so that comparing it
+# against values starting from the root usually doesn't pay (= usually doesn't
+# get us out of the loop early). See Knuth, Volume 3, where this is
+# explained and quantified in an exercise.
+#
+# Cutting the # of comparisons is important, since these routines have no
+# way to extract "the priority" from an array element, so that intelligence
+# is likely to be hiding in custom __cmp__ methods, or in array elements
+# storing (priority, record) tuples. Comparisons are thus potentially
+# expensive.
+#
+# On random arrays of length 1000, making this change cut the number of
+# comparisons made by heapify() a little, and those made by exhaustive
+# heappop() a lot, in accord with theory. Here are typical results from 3
+# runs (3 just to demonstrate how small the variance is):
+#
+# Compares needed by heapify     Compares needed by 1000 heappops
+# --------------------------     --------------------------------
+# 1837 cut to 1663               14996 cut to 8680
+# 1855 cut to 1659               14966 cut to 8678
+# 1847 cut to 1660               15024 cut to 8703
+#
+# Building the heap by using heappush() 1000 times instead required
+# 2198, 2148, and 2219 compares: heapify() is more efficient, when
+# you can use it.
+#
+# The total compares needed by list.sort() on the same lists were 8627,
+# 8627, and 8632 (this should be compared to the sum of heapify() and
+# heappop() compares): list.sort() is (unsurprisingly!) more efficient
+# for sorting.
+
+def _siftup(heap, pos):
+    endpos = len(heap)
+    startpos = pos
+    newitem = heap[pos]
+    # Bubble up the smaller child until hitting a leaf.
+    childpos = 2*pos + 1 # leftmost child position
+    while childpos < endpos:
+        # Set childpos to index of smaller child.
+        rightpos = childpos + 1
+        if rightpos < endpos and heap[rightpos] < heap[childpos]:
+            childpos = rightpos
+        # Move the smaller child up.
+        heap[pos] = heap[childpos]
+        pos = childpos
+        childpos = 2*pos + 1
+    # The leaf at pos is empty now. Put newitem there, and bubble it up
+    # to its final resting place (by sifting its parents down).
+    heap[pos] = newitem
+    _siftdown(heap, startpos, pos)
 
 if __name__ == "__main__":
     # Simple sanity test
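
The trade-off described in the comment block (stopping early costs two compares per level, while going straight to a leaf costs one per level plus a short walk back up) can be seen by running both strategies side by side. The following is a minimal Python 3 sketch modelled on the removed and added routines in this diff, not the committed code itself. It is rewritten to use only the < operator so compares can be counted, and Counted, sift_early_exit, sift_leaf_first, and pop_all are hypothetical names chosen for this illustration.

```python
import random


class Counted:
    """Hypothetical wrapper that counts every `<` comparison between instances."""
    compares = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.compares += 1
        return self.value < other.value


def sift_early_exit(heap, pos):
    """Old strategy: stop once the item is no larger than its smaller child."""
    endpos = len(heap)
    item = heap[pos]
    childpos = 2 * pos + 1
    while childpos < endpos:
        rightpos = childpos + 1
        if rightpos < endpos and heap[rightpos] < heap[childpos]:
            childpos = rightpos
        if not (heap[childpos] < item):   # second compare on every level
            break
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2 * pos + 1
    heap[pos] = item


def sift_leaf_first(heap, pos):
    """New strategy: bubble smaller children up to a leaf, then sift the item back."""
    endpos = len(heap)
    startpos = pos
    item = heap[pos]
    childpos = 2 * pos + 1
    while childpos < endpos:
        rightpos = childpos + 1
        if rightpos < endpos and heap[rightpos] < heap[childpos]:
            childpos = rightpos
        heap[pos] = heap[childpos]        # one compare per level on the way down
        pos = childpos
        childpos = 2 * pos + 1
    # The leaf slot is free; walk back toward startpos while the parent is larger.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        if not (item < heap[parentpos]):
            break
        heap[pos] = heap[parentpos]
        pos = parentpos
    heap[pos] = item


def pop_all(values, sift):
    """Heapify with sift, then pop everything; return (values in pop order, compares used by the pops)."""
    heap = [Counted(v) for v in values]
    for i in range(len(heap) // 2 - 1, -1, -1):
        sift(heap, i)
    Counted.compares = 0                  # count only the pops
    popped = []
    while heap:
        last = heap.pop()
        if heap:
            popped.append(heap[0].value)
            heap[0] = last
            sift(heap, 0)
        else:
            popped.append(last.value)
    return popped, Counted.compares


if __name__ == "__main__":
    rng = random.Random(0)
    values = [rng.random() for _ in range(1000)]
    print("early exit:", pop_all(values, sift_early_exit)[1])
    print("leaf first:", pop_all(values, sift_leaf_first)[1])
```

Both variants pop the values in sorted order; only the number of compares differs, and the exact counts depend on the input.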
