Eliminate space leaks on ZStreams that concatenate infinitely #1952
This change fixes a leak with streams that concatenate infinitely. Previously, any stream that is used with [...]

Instead, we now keep concatenations in an initial representation that can be introspected before evaluation. During evaluation, we take care to reuse the same ZManaged scope and not introduce additional ones.

This change currently does not fix space leaks caused by streams that recurse infinitely with ZStream#flatMap. It would seem that recursing with ZStream#flatMap isn't that useful (because this sort of infinite "downward" recursion indicates that no progress is made on the outer stream), but it's actually useful for embedding effects that control the recursion in the stream. For example:

    def go: Stream[Random, Int] =
      Stream.fromEffect(random.nextInt).flatMap { i =>
        if (i % 2 == 0) Stream.succeed(i) ++ go
        else Stream.succeed(i)
      }

This will introduce another scope for every iteration until an odd number is hit.

The previous version of ZStream#repeatWith uses a similar technique that solves the problem extremely elegantly. So it seems like something worth supporting. It is tempting to see if the same approach can solve this issue. I'll look into that over the next few weeks but if anyone would like to tackle that, feel free.
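To illustrate the idea of keeping concatenations in an introspectable representation, here is a minimal sketch in plain Scala without ZIO. It is not the code from this PR: `Leaf`, `concat`, `forever` and `take` are hypothetical names chosen for the example, and lists stand in for managed pulls. The point it tries to show is that an interpreter walking the structure from outside needs only a small, bounded worklist, even when the concatenation is infinite.

```scala
// Hypothetical, ZIO-free model of an introspectable concatenation structure.
// The names are illustrative assumptions and do not mirror the PR's internals.
sealed trait Structure[A]

object Structure {
  // A leaf whose elements can be emitted directly.
  final case class Leaf[A](elems: List[A]) extends Structure[A]

  // A concatenation whose tail is a thunk, so infinite structures can be built lazily.
  final case class Concat[A](hd: Structure[A], tl: () => Structure[A]) extends Structure[A]

  def concat[A](hd: Structure[A], tl: => Structure[A]): Structure[A] =
    Concat(hd, () => tl)

  // Defined like `forever = self ++ forever` in the discussion.
  def forever[A](s: Structure[A]): Structure[A] =
    concat(s, forever(s))

  // Interprets the structure with an explicit worklist instead of nesting one
  // evaluation inside another. For right-nested concatenations the worklist
  // never holds more than two entries.
  def take[A](s: Structure[A], n: Int): List[A] = {
    val out       = List.newBuilder[A]
    var remaining = n
    var pending: List[() => Structure[A]] = List(() => s)
    while (remaining > 0 && pending.nonEmpty) {
      val next = pending.head()
      pending = pending.tail
      next match {
        case Leaf(elems) =>
          val chunk = elems.take(remaining)
          out ++= chunk
          remaining -= chunk.length
        case Concat(hd, tl) =>
          // Schedule the head first, then the (still unevaluated) tail.
          pending = (() => hd) :: tl :: pending
      }
    }
    out.result()
  }
}
```

For example, `Structure.take(Structure.forever(Structure.Leaf(List(1, 2))), 5)` walks the infinite structure and stops after five elements. The real change has to do the same kind of walk while switching managed resources, which this sketch leaves out.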
Also, big thanks to @sebver who noticed the issue and shared the heap dumps with us.
Nice. I think it's very elegant.
    .uninterruptible
    r1 <- resR1.acquire
    } yield r1,
    acquire = ZIO.uninterruptibleMask { restore =>
👍
     * Appends another stream to this stream. The concatenated stream will first emit the
     * elements of this stream, and then emit the elements of the `other` stream.
     */
    final def concat[R1 <: R, E1 >: E, A1 >: A](other: => ZStream[R1, E1, A1]): ZStream[R1, E1, A1] =
What exactly ended up being the issue with the implementation of concat based on ZManaged#switchable that you posted on Discord? I thought it was a bit nicer than introducing Structure.
Good question and good to have it written in this PR for posterity.
The previous ++ would do something like this (written minimally on purpose):
    def ++(other: => ZStream) =
      ZStream {
        for {
          switched <- Ref.make(false).toManaged_
          currPull <- Ref.make(...)
          pull = {
            // get the current pull from the Ref, and pull it
            // if it ended and !switched, switch over to `other`
            // otherwise end
          }
        } yield pull
      }

As you can see, it allocates a bunch of resources that are retained for the existence of the stream (self ++ other).
Now, let's consider what happens in self.forever (defined as forever = self ++ forever) - or any stream that concatenates recursively. When we evaluate ++, it allocates these refs and starts pulling self. Once self is done, the ZManaged for other is opened and that one starts being pulled.
But, other is actually another ++, so more Ref instances are allocated! This happens on every switch to the RHS, so every recursion allocates more Ref. That would be fine if we could somehow stop retaining the current Ref, but we can't "exit" the scope of the current ZManaged. This is the root of the problem. Once we're inside a scope, we can't exit it.
So this formulation is heap-unsafe. I tried pretty hard, but I couldn't work around it properly. I think that the way to solve it is to interpret the recursive structure "from outside", before entering the scope that is recursive.
I hope this makes sense!
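The retention problem described above can be modelled with a toy counter, again in plain Scala without ZIO. Everything here is a hypothetical stand-in: `ScopeCounter` plays the role of ZManaged scopes (and the Refs they hold), `nestedRepeat` mimics switching to the right-hand side from inside the current scope, and `flatRepeat` mimics an interpreter that stays in one scope. It is only a sketch of the shape of the problem, not of the actual implementation.

```scala
// Toy model of scope retention. All names are illustrative assumptions.
object ScopeModel {
  // Tracks how many scopes are open at once, and the peak value seen.
  final class ScopeCounter {
    private var open = 0
    var maxOpen      = 0
    def enter(): Unit = { open += 1; if (open > maxOpen) maxOpen = open }
    def exit(): Unit  = open -= 1
  }

  // Nested formulation: the right-hand side is evaluated inside the scope of the
  // current concatenation, so that scope stays open until the whole right-hand
  // side has finished. Every repetition adds one more retained scope.
  def nestedRepeat(times: Int, counter: ScopeCounter): Unit =
    if (times > 0) {
      counter.enter()
      try nestedRepeat(times - 1, counter)
      finally counter.exit()
    }

  // Flat formulation: a single scope is opened once and every repetition
  // happens inside it, so the number of retained scopes stays at one.
  def flatRepeat(times: Int, counter: ScopeCounter): Unit = {
    counter.enter()
    try {
      var i = 0
      while (i < times) i += 1 // stands in for switching to the next stream
    } finally counter.exit()
  }
}
```

Running `nestedRepeat(1000, c)` leaves `c.maxOpen` at 1000, while `flatRepeat(1000, c)` on a fresh counter leaves it at 1. An infinite stream corresponds to `times` growing without bound, which is why the nested formulation is heap-unsafe.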
Ah, I think I understand. But this is a very subtle issue O.o
Thanks for the explanation and nice fix!
Massive 💪
Awesome, I just have cosmetic changes, hope they make sense.
    // below to swap the current stream being pulled with tl.
    switchPull(hd.process).mapError(Some(_)).tap(currPull.set) *>
      nextPull.set(Some(tl)) *> go(doneRef, currPull, nextPull, switchPull)
    }
Suggestion:
    trait Structure[R, E, A] {
      def process: ZManaged[R, E, Pull[R, E, A]]
      def tail: Option[() => Structure[R, E, A]]
    }
    case class Iterator[R, E, A](process: ZManaged[R, E, Pull[R, E, A]]) extends Structure[R, E, A] {
      final val tail = None
    }
    case class Concat[-R, +E, +A](hd: Structure[R, E, A], tl: () => Structure[R, E, A])
        extends Structure[R, E, A] {
      final def process = hd.process
      final def tail = Some(tl)
    }

    val structure = tl()
    switchPull(structure.hd).mapError(Some(_)).tap(currPull.set) *>
      nextPull.set(structure.tl) *> go(doneRef, currPull, nextPull, switchPull)
I'm actually going to avoid pulling more things up into Structure because I think this is going to end up as a mini-algebra, with an additional FlatMap constructor.
Then I'll move the interpreter up from Concat (which is the only composite term currently requiring interpretation) into Structure.
    (ZStream.fromEffect(clock.sleep(decision.delay)).drain ++
      self.map(f) ++ Stream.succeed(g(decision.finish()))).process

    switchPull(nextPull).mapError(Some(_)).tap(currPull.set(_)) *>
Have you noticed that pattern is coming up all the time with switchable? I wonder if we could refactor without having to keep currPull and nextPull refs around.
Yeah, there's some convenience to be had here: we could have the switchable constructor return something like:
    trait Switchable[R, E, A] {
      def switch(m: ZManaged[R, E, A]): ZIO[R, E, A]
      def get: ZIO[R, E, A]
    }
So the Ref is embedded in that.
Ah, it's actually not that simple. We would have to start requiring an initial resource, or get becomes UIO[Option[A]]. Beyond that, we need to think about what happens if switch fails to acquire the resource. What would get return afterwards? That also pushes us towards UIO[Option[A]]. The problem is that sticking an Option in there can be a real pain for calling code.
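The trade-off around `Option` can be made concrete with a small model in plain Scala. This is a sketch under loud assumptions: `Managed` below is a hypothetical stand-in with just an acquire and a release function, not ZManaged, and effects are ordinary side-effecting calls rather than ZIO values. It only shows why `get` ends up returning an `Option` once there is no initial resource and `switch` can fail.

```scala
// Hypothetical, ZIO-free stand-in for a managed resource (not ZManaged's API).
final case class Managed[A](acquire: () => A, release: A => Unit)

// Sketch of a switchable slot that embeds its own mutable state.
final class Switchable[A] {
  // No initial resource is required, so the slot starts out empty.
  private var current: Option[(A, A => Unit)] = None

  // Releases the current resource (if any) and then acquires the new one.
  // If acquisition throws, the slot is left empty, so `get` yields None.
  def switch(m: Managed[A]): A = {
    current.foreach { case (a, release) => release(a) }
    current = None
    val a = m.acquire()
    current = Some((a, m.release))
    a
  }

  // Returns None before the first switch and after a failed switch.
  def get: Option[A] = current.map(_._1)
}
```

Callers of `get` have to handle the empty case every time, which is the inconvenience mentioned above.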
I second @regiskuckaertz's suggestions.

Thanks for reviewing @regiskuckaertz! At some point in this PR I deleted
Nice one🕺
    def go(
      as: ZIO[R1, Option[E1], A],
      finalizer: Ref[Exit[_, _] => URIO[R1, _]],
      currPull: Ref[ZIO[R1, Option[E1], B]]
    - currPull: Ref[ZIO[R1, Option[E1], B]]
    + currPull: Ref[Pull[R1, E1, B]]
      as: ZIO[R1, Option[E1], A],
      finalizer: Ref[Exit[_, _] => URIO[R1, _]],
      currPull: Ref[ZIO[R1, Option[E1], B]]
    ): ZIO[R1, Option[E1], B] = {
    - ): ZIO[R1, Option[E1], B] = {
    + ): Pull[R1, E1, B] = {
      }
    }

    private[stream] sealed abstract class Structure[-R, +E, +A] {
Good idea to keep all the guts private so we can experiment throughout 1.x.
@jdegoes Would you like to have another look at this? Or is this good for merging?
This type of recursion contains a space leak, as it introduces a new ZManaged scope for every repetition that is retained (along with all of its resources - AtomicReferences, for example) until the stream ends.
cf66cc3
Force-pushed from 410a27e to cf66cc3.
Merging this because it's needed to solve a memory leak with zio-kafka. @jdegoes Happy to discuss further if you have any additional thoughts!
* Rewrite ZManaged#flatMap without for comprehensions
* Add ZStream.Structure to allow introspecting Stream concatenations
* Rewrite ZStream#repeatWith without recursing through ZStream#flatMap
  This type of recursion contains a space leak, as it introduces a new ZManaged scope for every repetition that is retained (along with all of its resources - AtomicReferences, for example) until the stream ends.
* Use `Pull` where possible
* Use Pull where possible