Lazier LazyList. #6880
As lamented in scala/bug#10696 and bemoaned in scala/collection-strawman#367, `LazyList` (and `Stream` before it) does not have a way of representing a collection with uncomputed emptiness. This adds a third subclass of `LazyList`, `LazyList.Suspended`, which wraps a closure returning a `LazyList`, and only evaluates it when needed. This is about as lazy of a list as I can imagine, now. Fixes scala/bug#10696. Fixes scala/collection-strawman#367.
    else iterableFactory.empty).asInstanceOf[C]
  }
  private[immutable] def filterImpl(p: A => Boolean, isFlipped: Boolean): C =
    suspend0 {
just using suspend here complains that we got C but expected CC[A]
You can add a constraint that C <: CC[A], if that helps. But such constraints usually make things hard to abstract over.
C is defined as
+C <: CC[A] with LazyListOps[A, CC, C]
The problem (I think) is that we don't know that the iterableFactory is going to return the exact same collection type (C) when asked to produce a collection type wrapping A, even though we know/hope that C =:= CC[A].
f(s) match {
  case None => empty[A]
  case Some((a, s1)) => newCons(a, loop(s1))
}

eval = null
while (res.isInstanceOf[Suspended[_]]) {
  // skip through multiple suspensions in a loop rather than using the stack
  res = res.asInstanceOf[Suspended[A]].next
Is there a test to make sure this is stack-safe?
There will be.
- test for stack safety
I'm actually fairly certain this isn't stack-safe, because each call to next checks isInstanceOf[Suspended[_]], and then calls next on the result. Thus, you get a call stack (of sorts) like the one below
res.res.res.res.next
at res.res.res.next
at res.res.next
at res.next
at this.next
Edit: I have tested it
scala> def nest(n: Int): LazyList[Int] = {
     |   if (n > 0) LazyList.suspend(nest(n - 1))
     |   else LazyList.empty[Int]
     | }
nest: (n: Int)LazyList[Int]

scala> nest(10000)
java.lang.StackOverflowError
  at .$anonfun$nest$1(<console>:2)
  at scala.collection.immutable.LazyList$Suspended.next$lzycompute(LazyList.scala:685)
  at scala.collection.immutable.LazyList$Suspended.next(LazyList.scala:683)
  at scala.collection.immutable.LazyList$Suspended.next$lzycompute(LazyList.scala:689)
  at scala.collection.immutable.LazyList$Suspended.next(LazyList.scala:683)
(it also seems to be losing the laziness somewhere, but I'm not sure where)
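For illustration, here is a minimal sketch of one way to unwrap nested suspensions without growing the stack. It is not the PR's code: `Lst`, `Empty`, `Cons`, `Suspended`, `step`, `force` and `suspend` are all hypothetical names for a toy list type. The idea is that each node runs only its own thunk (`step`), and the outermost caller walks the chain in a `while` loop, so no node ever calls `force` on the node it produced.

```scala
object SuspendSketch {
  // Toy, invariant list type; hypothetical, not scala.collection.immutable.LazyList.
  sealed trait Lst[A]
  final case class Empty[A]() extends Lst[A]
  final case class Cons[A](head: A, tail: Lst[A]) extends Lst[A]

  final class Suspended[A](thunk: () => Lst[A]) extends Lst[A] {
    private var eval: () => Lst[A] = thunk
    private var cached: Lst[A] = null

    // Runs only this node's thunk, at most once. The result may itself be a
    // Suspended; it is deliberately NOT forced here, which keeps the depth O(1).
    private def step(): Lst[A] = synchronized {
      if (eval ne null) {
        cached = eval()
        eval = null // drop the closure so it can be garbage collected
      }
      cached
    }

    // Walks the chain of suspensions iteratively instead of recursively.
    def force: Lst[A] = {
      var res: Lst[A] = this
      while (res.isInstanceOf[Suspended[_]]) {
        res = res.asInstanceOf[Suspended[A]].step()
      }
      synchronized { cached = res } // shortcut for later calls
      res
    }
  }

  def suspend[A](body: => Lst[A]): Lst[A] = new Suspended(() => body)

  def force[A](l: Lst[A]): Lst[A] = l match {
    case s: Suspended[A] => s.force
    case other           => other
  }

  // Same shape as the REPL example above, on the toy type.
  def nest(n: Int): Lst[Int] =
    if (n > 0) suspend(nest(n - 1)) else Empty[Int]()
}
```

With this shape, `SuspendSketch.force(SuspendSketch.nest(100000))` iterates through the suspensions one at a time. The key difference from the trace above is that `step` never calls `step` or `force` on another node.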
 *
 * @param eval a closure which will create another [[LazyList]]
 */
def suspend[A](eval: => LazyList[A]): LazyList[A] =
I'm not sure how I feel about the name suspend, both internally and externally, but especially as an externally visible API. How do you feel about the names defer and Deferred?
I'm okay with defer. Since we had the Deferrer class, I didn't want to overload the word.
}

@SerialVersionUID(3L)
final class Suspended[A](var eval: () => LazyList[A]) extends LazyList[A] {
I'm not sure we need the Deferrer class anymore, if we have this
I'd love to get rid of the Deferrer class, but doing so makes this break:
val cycle1: LazyList[Int] = 1 #:: 2 #:: cycle1
and call me a crazy functional programmer, but I like being able to do that.
(1 #:: 2 #:: suspend(cycle1) is a workaround, but not one I find all that pretty.)
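As a small illustration of the self-referential definition being defended here, the following sketch uses the `LazyList` from the Scala 2.13 standard library rather than this PR's branch. The object name `CycleSketch` is hypothetical, and `lazy val` is used so the same definition would also be valid as a local value.

```scala
object CycleSketch {
  // The right-hand operand of #:: is taken by name, so referring to cycle1
  // inside its own definition does not evaluate it eagerly.
  lazy val cycle1: LazyList[Int] = 1 #:: 2 #:: cycle1

  // Only the first five cells of the cycle are ever computed here.
  def firstFive: List[Int] = cycle1.take(5).toList
}
```

`CycleSketch.firstFive` walks around the cycle and yields `List(1, 2, 1, 2, 1)`.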
tl
val res = tl()
tl = null
res
NthPortal
left a comment
I think it could use a few more tests to assert that it's fully lazy, but that's about it.
Awesome!
}

assertEquals(true, Try { wf.map(identity) }.isFailure) // throws on n == 5
assertEquals(true, Try { wf.map(identity).force }.isFailure) // throws on n == 5
It might be more worthwhile to change the tests to assert better laziness
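One way to assert laziness directly is to count how often the mapping function runs. The sketch below is only an illustration of that style: it uses the `LazyList` from the Scala 2.13 standard library, not the `wf` fixture or the code in this PR, and `LazinessSketch` and `observe` are hypothetical names.

```scala
object LazinessSketch {
  // Returns (calls after building the mapped list, head value, calls after head).
  def observe(): (Int, Int, Int) = {
    var calls = 0
    val mapped = LazyList.from(1).map { n => calls += 1; n * 2 }
    val afterMap = calls   // a fully lazy map has not run the function yet
    val h = mapped.head    // forces the first element only as far as needed
    val afterHead = calls
    (afterMap, h, afterHead)
  }
}
```

A test in this style would assert that the first counter is still zero after `map` and that asking for `head` is what triggers evaluation.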
@SerialVersionUID(3L)
final class Suspended[A](var eval: () => LazyList[A]) extends LazyList[A] {
  private[this] var evaluated: Boolean = false
should this be @volatile? (same question for hdDefined and tlDefined in Cons)
I have no idea. I think maybe, although since head's protected by its own volatile bitmap field, the danger is that one thread will call head, and the other thread will get false from headDefined, which is unfortunate but probably not harmful.
I think the evaluated = true should go after the call to eval, though, now that I think on it.
I've been thinking about this a bunch, and I think that, because it's not volatile, the JMM allows another thread to read a value of true from evaluated, as long as it gets set to true at some point (possibly only if set by another thread?). Basically, without @volatile, there is no happens-before relationship between the write of true and the read of true if they (might?) happen on different threads, so it might not happen before.
I would ask a JMM expert before taking that as gospel truth, but it seems racy to me.
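To make the ordering concern concrete, here is a minimal sketch of a one-shot cell that publishes its result through a `@volatile` flag. It is not the PR's `Suspended` class: `Once`, `get` and the field names are hypothetical. The flag is written only after the value is stored, so a thread that reads `true` from the flag is guaranteed by the volatile write/read pair to see the value as well.

```scala
object PublishSketch {
  final class Once[A](thunk: () => A) {
    @volatile private var evaluated: Boolean = false
    private var eval: () => A = thunk
    private var value: A = null.asInstanceOf[A]

    def get: A = {
      if (!evaluated) synchronized {
        if (!evaluated) {      // re-check under the lock
          value = eval()       // 1. compute and store the result
          eval = null          // 2. release the thunk for garbage collection
          evaluated = true     // 3. volatile write comes last, publishing 1 and 2
        }
      }
      value                    // read happens after a volatile read of `true`
    }
  }
}
```

If the thunk throws, the flag stays `false` and the thunk is kept, so a later call to `get` retries. Whether that retry behaviour is what the PR wants is a separate design question.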
Thank you for this! I won't be able to properly review this until July 12. I'm just leaving two high-level comments for now:
As added food for thought: I wonder whether it is still useful that the head be lazy. Are there use cases where one already knows that the lazy list is non-empty, but doesn't know yet what its |
@sjrd It is conceivable to me that there exists some situation where |
@NthPortal Yes, that is definitely conceivable. But at call site, is it likely that you want to call |
hd
val res = hd()
hd = null
res
The goal here is to let the thunk be garbage collected?
julienrf
left a comment
Thanks @hrhino for working on this!
It seems that your design is equivalent to the one proposed by @sjrd in scala/collection-strawman#367 (comment) but I must say that I have a slight preference for his design because it prevents returning an eager LazyList by mistake (like we currently do in Process#lazyLines).
Also, I have a preference for renaming suspend into defer, as this was suggested by someone else.
Maybe it still makes sense to have a lazy
def expensiveList(n: Int): LazyList[Int] =
  LazyList {
    if (n == 0) State.empty
    else State.nonEmpty(heavyComputation(n), expensiveList(n - 1))
  }
In this example, it might be useful in some cases to evaluate the fact that the underlying state is |
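The following toy sketch shows what a state-based list with a lazy head could look like, so that emptiness can be observed without running the expensive head computation. It is an assumption-laden illustration, not the proposed API: `LL`, `State`, `Empty`, `NonEmpty` and the `heavyCalls` counter are hypothetical names, and `heavyComputation` is a stand-in for the function in the example above.

```scala
object StateSketch {
  sealed trait State[A]
  final case class Empty[A]() extends State[A]
  // The head is taken by name and memoised, so it runs only when asked for.
  final class NonEmpty[A](hd: => A, val tail: LL[A]) extends State[A] {
    lazy val head: A = hd
  }

  // The whole state (and therefore emptiness) is computed on first access.
  final class LL[A](st: => State[A]) {
    lazy val state: State[A] = st
    def isEmpty: Boolean = state.isInstanceOf[Empty[_]]
  }

  var heavyCalls = 0
  def heavyComputation(n: Int): Int = { heavyCalls += 1; n * n }

  def expensiveList(n: Int): LL[Int] =
    new LL(
      if (n == 0) Empty[Int]()
      else new NonEmpty(heavyComputation(n), expensiveList(n - 1))
    )
}
```

Here `expensiveList(3).isEmpty` forces only the state of the first cell. `heavyComputation` is not called until `head` is read on the resulting `NonEmpty`.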
Your example is again based on the definition site. It does not address what I said #6880 (comment) any more than what has previously been said. |
@julienrf: I did originally have a design similar to @sjrd's, but I figured there was too much indirection calling
How do you mean? The relevant bit called by
def next(): LazyList[T] = LazyList.suspend(q.take match {
  case Left(0) => LazyList.empty
  case Left(code) => if (nonzeroException) scala.sys.error("Nonzero exit code: " + code) else LazyList.empty
  case Right(s) => LazyList.cons(s, next())
})
(previously the
I'm going to take Sébastien's advice and make the subclasses private; if we ensure that the
With reference to @sjrd's point about strictness in the head: Would |
For instance when we implement transformation operations such as |
@sjrd I guess there can be some algorithms that care about the fact that a lazy list is empty or not, and then defer when they actually evaluate the head to a later point. OK, I have no examples… |
@hrhino What do you think about #6880 (comment) ? |
@julienrf I think that it's a fair point (and I'm rewriting this to use Sébastien's design), but I think it doesn't go far enough on its own; we also need to take care of those |
@hrhino For methods such as |
@hrhino Will you have time to continue this work? Do you want any help? |
Yes, I'll pick it up this weekend. It's been really busy at work; sorry. I'll shout in the strawman gitter if I get confused. Thanks for the poke. When's the feature freeze deadline? |
The targeted deadline is the 10th of August. |
Any update on this @hrhino? If you are too busy maybe you can push your current state somewhere and someone else can finish the work? |
Yes, sorry, I may need to do that. |
Status summary
We have decided it is best to go with an implementation like the one suggested by @sjrd; specifically, that
With this new internal structure, significantly less is shared with
Thus, there are two major tasks to be accomplished to make a better
I am currently working on (1). There are various aspects of the previous |
Regarding (2), I’m not sure porting (again) the 2.12 implementation to the new design would be simpler than just inlining the |
@julienrf I'm open to that as well. |
I'm closing this in favor of @NthPortal's work. Sorry for the false alarm. |
See #7000 |
As lamented in scala/bug#10696 and bemoaned in scala/collection-strawman#367, `LazyList` (and `Stream` before it) does not have a way of representing a collection with uncomputed emptiness. This adds a third subclass of `LazyList`, `LazyList.Suspended`, which wraps a closure returning a `LazyList`, and only evaluates it when needed. This is about as lazy of a list as I can imagine, now.
Two tests are currently failing, both because `ll.filter(...).map(...)` is even lazier than contemplated. I'm not sure how to test for the linked bug (scala/bug#9134); I'll take suggestions or think about it over the vacation.
BTW, this is the very first thing I've done with the new collections, so I hope it's stylistically alright. Let me know if I've broken convention or something.
Fixes scala/bug#10696.
Fixes scala/collection-strawman#367.