Commit f6ed074

This no longer leaks memory when run in an infinite loop. However, that required explicitly calling LazyList.clear() in the two tests that use LazyList (I added a LazyList Fibonacci generator too).

A real bitch: the extremely inefficient first version of the 2-3-5 test *looked* like a slow leak on Win98SE, but it wasn't "really": it generated so many results that the heap grew over 4Mb (tons of frames! the number of frames grows exponentially in that test). Then Win98SE malloc() starts fragmenting address space allocating more and more heaps, and the visible memory use grew very slowly while the disk was thrashing like mad. Printing fewer results (i.e., keeping the heap burden under 4Mb) made that illusion vanish.

Looks like there's no hope for plugging the LazyList leaks automatically short of adding frameobjects and genobjects to gc. OTOH, they're very easy to break by hand, and they're the only *kind* of plausibly realistic leaks I've been able to provoke. Dilemma.
1 parent ce9b5a5 commit f6ed074
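
The reference cycle the commit message describes can be reproduced with a minimal sketch. It is written for Python 3 on CPython, which is an assumption: the commit targets a Python 2 era interpreter whose cycle collector did not track frames or generators. The helper names make_fib and still_alive_after_drop are hypothetical; LazyList and the Fibonacci generator follow the doctest in this commit. Switching the collector off approximates the old behaviour, leaving only reference counting to free the objects.

```python
import gc
import weakref


class LazyList:
    """Cache the items of an iterator so they can be indexed repeatedly."""

    def __init__(self, g):
        self.sofar = []
        self.fetch = g.__next__  # Python 3 spelling of g.next

    def __getitem__(self, i):
        sofar, fetch = self.sofar, self.fetch
        while i >= len(sofar):
            sofar.append(fetch())
        return sofar[i]

    def clear(self):
        # Dropping self.fetch releases the generator and breaks the cycle.
        self.__dict__.clear()


def make_fib():
    """Hypothetical helper: build the self-referential Fibonacci LazyList."""

    def fibgen(a, b):
        def pairwise_sum(g, h):
            while True:
                yield next(g) + next(h)

        def tail(g):
            next(g)  # throw first away
            yield from g

        yield a
        yield b
        # The generator reads from the very LazyList that wraps it:
        # LazyList -> fetch -> generator -> frame -> fib -> LazyList.
        yield from pairwise_sum(iter(fib), tail(iter(fib)))

    fib = LazyList(fibgen(1, 2))
    return fib


def still_alive_after_drop(call_clear):
    """Return True if the LazyList outlives its last outside reference."""
    was_enabled = gc.isenabled()
    gc.disable()  # assumption: emulate an interpreter that cannot see the cycle
    try:
        fib = make_fib()
        assert [fib[i] for i in range(10)] == [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
        ref = weakref.ref(fib)
        if call_clear:
            fib.clear()
        del fib
        return ref() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # a modern collector does reclaim the leftover cycle


print("without clear():", still_alive_after_drop(call_clear=False))
print("with clear():   ", still_alive_after_drop(call_clear=True))
```

On a current CPython the uncleared object should survive only until the next collection, which is why the sketch disables the collector while measuring.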

1 file changed

Lines changed: 58 additions & 81 deletions

Lib/test/test_generators.py

@@ -445,15 +445,13 @@
 >>> firstn(primes, 20)
 [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
 
+
 Another famous problem: generate all integers of the form
 2**i * 3**j * 5**k
 in increasing order, where i,j,k >= 0. Trickier than it may look at first!
 Try writing it without generators, and correctly, and without generating
 3 internal results for each result output.
 
-XXX Suspect there's memory leaks in this one; definitely in the next
-XXX version.
-
 >>> def times(n, g):
 ...     for i in g:
 ...         yield n * i
@@ -475,11 +473,11 @@
 ...             ng = g.next()
 ...             nh = h.next()
 
-This works, but is doing a whale of a lot or redundant work -- it's not
-clear how to get the internal uses of m235 to share a single generator.
-Note that me_times2 (etc) each need to see every element in the result
-sequence. So this is an example where lazy lists are more natural (you
-can look at the head of a lazy list any number of times).
+The following works, but is doing a whale of a lot of redundant work --
+it's not clear how to get the internal uses of m235 to share a single
+generator. Note that me_times2 (etc) each need to see every element in the
+result sequence. So this is an example where lazy lists are more natural
+(you can look at the head of a lazy list any number of times).
 
 >>> def m235():
 ...     yield 1
@@ -491,22 +489,26 @@
 ...                    me_times5):
 ...         yield i
 
+Don't print "too many" of these -- the implementation above is extremely
+inefficient: each call of m235() leads to 3 recursive calls, and in
+turn each of those 3 more, and so on, and so on, until we've descended
+enough levels to satisfy the print stmts. Very odd: when I printed 5
+lines of results below, this managed to screw up Win98's malloc in "the
+usual" way, i.e. the heap grew over 4Mb so Win98 started fragmenting
+address space, and it *looked* like a very slow leak.
+
 >>> result = m235()
->>> for i in range(5):
+>>> for i in range(3):
 ...     print firstn(result, 15)
 [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
 [25, 27, 30, 32, 36, 40, 45, 48, 50, 54, 60, 64, 72, 75, 80]
 [81, 90, 96, 100, 108, 120, 125, 128, 135, 144, 150, 160, 162, 180, 192]
-[200, 216, 225, 240, 243, 250, 256, 270, 288, 300, 320, 324, 360, 375, 384]
-[400, 405, 432, 450, 480, 486, 500, 512, 540, 576, 600, 625, 640, 648, 675]
 
 Heh. Here's one way to get a shared list, complete with an excruciating
 namespace renaming trick. The *pretty* part is that the times() and merge()
 functions can be reused as-is, because they only assume their stream
 arguments are iterable -- a LazyList is the same as a generator to times().
 
-XXX Massive memory leaks in this; see Python-Iterators.
-
 >>> class LazyList:
 ...     def __init__(self, g):
 ...         self.sofar = []
@@ -517,6 +519,9 @@
 ...         while i >= len(sofar):
 ...             sofar.append(fetch())
 ...         return sofar[i]
+...
+...     def clear(self):
+...         self.__dict__.clear()
 
 >>> def m235():
 ...     yield 1
@@ -529,6 +534,9 @@
 ...                    me_times5):
 ...         yield i
 
+Print as many of these as you like -- *this* implementation is memory-
+efficient. XXX Except that it leaks unless you clear the dict!
+
 >>> m235 = LazyList(m235())
 >>> for i in range(5):
 ...     print [m235[j] for j in range(15*i, 15*(i+1))]
@@ -537,6 +545,34 @@
 [81, 90, 96, 100, 108, 120, 125, 128, 135, 144, 150, 160, 162, 180, 192]
 [200, 216, 225, 240, 243, 250, 256, 270, 288, 300, 320, 324, 360, 375, 384]
 [400, 405, 432, 450, 480, 486, 500, 512, 540, 576, 600, 625, 640, 648, 675]
+
+>>> m235.clear() # XXX memory leak without this
+
+
+Ye olde Fibonacci generator, LazyList style.
+
+>>> def fibgen(a, b):
+...
+...     def sum(g, h):
+...         while 1:
+...             yield g.next() + h.next()
+...
+...     def tail(g):
+...         g.next() # throw first away
+...         for x in g:
+...             yield x
+...
+...     yield a
+...     yield b
+...     for s in sum(iter(fib),
+...                  tail(iter(fib))):
+...         yield s
+
+>>> fib = LazyList(fibgen(1, 2))
+>>> firstn(iter(fib), 17)
+[1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584]
+
+>>> fib.clear() # XXX memory leak without this
 """
 
 # syntax_tests mostly provokes SyntaxErrors. Also fiddling with #if 0
@@ -670,72 +706,11 @@
 <type 'None'>
 """
 
-
-x_tests = """
-
->>> def firstn(g, n):
-...     return [g.next() for i in range(n)]
-
->>> def times(n, g):
-...     for i in g:
-...         yield n * i
-
->>> def merge(g, h):
-...     ng = g.next()
-...     nh = h.next()
-...     while 1:
-...         if ng < nh:
-...             yield ng
-...             ng = g.next()
-...         elif ng > nh:
-...             yield nh
-...             nh = h.next()
-...         else:
-...             yield ng
-...             ng = g.next()
-...             nh = h.next()
-
->>> class LazyList:
-...     def __init__(self, g):
-...         self.sofar = []
-...         self.fetch = g.next
-...
-...     def __getitem__(self, i):
-...         sofar, fetch = self.sofar, self.fetch
-...         while i >= len(sofar):
-...             sofar.append(fetch())
-...         return sofar[i]
-
->>> def m235():
-...     yield 1
-...     # Gack: m235 below actually refers to a LazyList.
-...     me_times2 = times(2, m235)
-...     me_times3 = times(3, m235)
-...     me_times5 = times(5, m235)
-...     for i in merge(merge(me_times2,
-...                          me_times3),
-...                    me_times5):
-...         yield i
-
->>> m235 = LazyList(m235())
->>> for i in range(5):
-...     x = [m235[j] for j in range(15*i, 15*(i+1))]
-
-
-[1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
-[25, 27, 30, 32, 36, 40, 45, 48, 50, 54, 60, 64, 72, 75, 80]
-[81, 90, 96, 100, 108, 120, 125, 128, 135, 144, 150, 160, 162, 180, 192]
-[200, 216, 225, 240, 243, 250, 256, 270, 288, 300, 320, 324, 360, 375, 384]
-[400, 405, 432, 450, 480, 486, 500, 512, 540, 576, 600, 625, 640, 648, 675]
-"""
-
-__test__ = {"tut": tutorial_tests, # clean
-            "pep": pep_tests, # clean
-            "email": email_tests, # clean
-            "fun": fun_tests, # leaks
-            "syntax": syntax_tests # clean
-            #"x": x_tests
-            }
+__test__ = {"tut": tutorial_tests,
+            "pep": pep_tests,
+            "email": email_tests,
+            "fun": fun_tests,
+            "syntax": syntax_tests}
 
 # Magic test name that regrtest.py invokes *after* importing this module.
 # This worms around a bootstrap problem.
@@ -745,8 +720,10 @@ def test_main():
     import doctest, test_generators
     if 0:
         # Temporary block to help track down leaks. So far, the blame
-        # has fallen mostly on doctest.
-        for i in range(5000):
+        # fell mostly on doctest. Later: the only leaks remaining are
+        # in fun_tests, and only if you comment out the two LazyList.clear()
+        # calls.
+        for i in range(10000):
             doctest.master = None
             doctest.testmod(test_generators)
     else:
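
The redundant work in the first 2-3-5 doctest, compared with the shared LazyList version, can be made visible by counting how many m235 generator bodies actually start running. The sketch below is a Python 3 rendering, which is an assumption since the doctests use Python 2 syntax. The `started` counter and the names naive_m235 and make_shared_m235 are hypothetical additions for illustration; times(), merge() and LazyList follow the diff.

```python
from itertools import islice

# Hypothetical instrumentation: how many m235 generator bodies began running.
started = {"naive": 0, "shared": 0}


def times(n, g):
    for i in g:
        yield n * i


def merge(g, h):
    ng, nh = next(g), next(h)
    while True:
        if ng < nh:
            yield ng
            ng = next(g)
        elif ng > nh:
            yield nh
            nh = next(h)
        else:
            yield ng
            ng = next(g)
            nh = next(h)


class LazyList:
    def __init__(self, g):
        self.sofar = []
        self.fetch = g.__next__

    def __getitem__(self, i):
        sofar, fetch = self.sofar, self.fetch
        while i >= len(sofar):
            sofar.append(fetch())
        return sofar[i]

    def clear(self):
        self.__dict__.clear()


def naive_m235():
    # Every instance spawns three more instances of itself.
    started["naive"] += 1
    yield 1
    yield from merge(merge(times(2, naive_m235()),
                           times(3, naive_m235())),
                     times(5, naive_m235()))


def make_shared_m235():
    def gen():
        # A single generator body; the three times() streams all read
        # from the LazyList that caches its output.
        started["shared"] += 1
        yield 1
        yield from merge(merge(times(2, m235),
                               times(3, m235)),
                         times(5, m235))

    m235 = LazyList(gen())
    return m235


naive_first = list(islice(naive_m235(), 15))
shared = make_shared_m235()
shared_first = [shared[i] for i in range(15)]
shared.clear()  # break the cycle by hand, as the commit does

print(naive_first == shared_first, started)
```

Asking for more results makes the naive count grow quickly, while the shared version still starts exactly one generator body.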
