foreachParN should not use more than 'n' fibers. #1165

@freskog

The current implementation of foreachParN uses a semaphore to restrict the number of fibers doing work concurrently. It does not restrict the number of fibers that are actually created, which leads to some surprising behavior. For example:

import zio._

val rt = new DefaultRuntime {}
val n = 10
val seq = 0 to 1000000
// Only 10 effects may run at once, yet one fiber is created per element.
val boom = ZIO.foreachParN[Any, Nothing, Int, Int](n)(seq)(UIO.succeed(_))
rt.unsafeRun(boom)

One would expect this to work, but instead it fails with an OutOfMemoryError:

[error] (zio-default-async-268-1441532751) java.lang.OutOfMemoryError: GC overhead limit exceeded
[error] java.lang.OutOfMemoryError: GC overhead limit exceeded
[error] 	at java.lang.Integer.valueOf(Integer.java:832)
[error] 	at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:67)
[error] 	at zio.internal.PlatformLive$ExecutorUtil$.$anonfun$makeDefault$1$adapted(PlatformLive.scala:67)
[error] 	at zio.internal.PlatformLive$ExecutorUtil$$$Lambda$4633/467302532.apply(Unknown Source)
[error] 	at zio.internal.PlatformLive$ExecutorUtil$$anon$2.yieldOpCount(PlatformLive.scala:115)
[error] 	at zio.internal.FiberContext.evaluateNow(FiberContext.scala:217)
[error] 	at zio.internal.FiberContext.$anonfun$fork$1(FiberContext.scala:586)
[error] 	at zio.internal.FiberContext$$Lambda$4847/1031667563.run(Unknown Source)
[error] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[error] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[error] 	at java.lang.Thread.run(Thread.java:748)
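
To make the distinction between "fibers running" and "fibers created" concrete, here is a minimal sketch that uses plain Scala Futures and a java.util.concurrent.Semaphore as a stand-in for ZIO fibers and ZIO's Semaphore. It is an analogy rather than the ZIO implementation, and the names SemaphoreGating, Gauge and Stats are hypothetical. It shows that gating with a semaphore caps how many bodies run at once while still creating one task per element.

```scala
import java.util.concurrent.Semaphore
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical illustration only: Futures stand in for ZIO fibers.
object SemaphoreGating {

  // Tracks how many task bodies are inside the permit at once, and the peak.
  final class Gauge {
    private var current = 0
    private var peak    = 0
    def enter(): Unit = synchronized { current += 1; if (current > peak) peak = current }
    def exit(): Unit  = synchronized { current -= 1 }
    def max: Int      = synchronized(peak)
  }

  final case class Stats[B](results: List[B], tasksCreated: Int, peakRunning: Int)

  def run[A, B](n: Int)(as: Seq[A])(f: A => B): Stats[B] = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val permits = new Semaphore(n)
    val created = new AtomicInteger(0)
    val gauge   = new Gauge

    // One Future per element is created up front, regardless of n.
    val futures = as.toList.map { a =>
      created.incrementAndGet()
      Future {
        permits.acquire() // only n bodies get past this point at a time
        try {
          gauge.enter()
          try f(a) finally gauge.exit()
        } finally permits.release()
      }
    }

    val results = Await.result(Future.sequence(futures), 60.seconds)
    Stats(results, created.get, gauge.max)
  }
}
```

With n = 4 and 1000 elements, peakRunning stays at or below 4 while tasksCreated is 1000. Scaled up to a million elements, that per-element allocation is the kind of cost the report above describes.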

A possible fix is to replace the semaphore with a bounded queue served by a fixed pool of n worker fibers.

The general idea would be something like:

  final def foreachParN[R >: LowerR, E <: UpperE, A, B](n: Long)(as: Iterable[A])(fn: A => ZIO[R, E, B]): ZIO[R, E, List[B]] =
    for {
      q     <- Queue.bounded[(Promise[E, B], A)](n.toInt)
      env   <- ZIO.environment[R]
      pairs <- ZIO.foreach(as)(a => Promise.make[E, B].map(p => (p, a))) // preserve order
      _     <- ZIO.foreach(pairs)(pair => q.offer(pair)).fork // publish work
      // n workers: each repeatedly takes a pair and completes its promise with the result
      _     <- ZIO.collectAll(List.fill(n.toInt)(
                 q.take.flatMap { case (p, a) =>
                   p.done(fn(a).provide(env).onError(_ => q.shutdown))
                 }.forever.fork
               ))
      res   <- ZIO.collectAll(pairs.map(_._1.await)).ensuring(q.shutdown) // wait for results
    } yield res

Other approaches are probably possible as well.
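
For comparison, the queue-plus-workers shape can be sketched without ZIO at all. The example below is a rough analogy that uses plain JVM threads and an ArrayBlockingQueue. WorkerPool is a hypothetical name, and stopping the workers with a None marker is an assumption of this sketch (the proposal above shuts the queue down instead). Exactly n workers are created, results keep the input order, and a failure in f is captured as a Failure instead of killing the worker.

```scala
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.atomic.{AtomicInteger, AtomicReferenceArray}
import scala.util.Try

// Hypothetical illustration only: threads stand in for ZIO worker fibers.
object WorkerPool {

  // Returns the results in input order, plus the number of workers created.
  def run[A, B](n: Int)(as: Seq[A])(f: A => B): (Vector[Try[B]], Int) = {
    val items   = as.toVector
    val results = new AtomicReferenceArray[Try[B]](items.length)
    // Bounded queue: the producer blocks once n items are waiting.
    val queue   = new ArrayBlockingQueue[Option[(Int, A)]](n)
    val workersCreated = new AtomicInteger(0)

    val workers = List.fill(n) {
      workersCreated.incrementAndGet()
      val t = new Thread(new Runnable {
        def run(): Unit = {
          var next = queue.take()
          while (next.isDefined) {   // None is the stop marker
            val (i, a) = next.get
            results.set(i, Try(f(a))) // the index preserves input order
            next = queue.take()
          }
        }
      })
      t.start()
      t
    }

    // Publish the work, then one stop marker per worker.
    items.zipWithIndex.foreach { case (a, i) => queue.put(Some((i, a))) }
    workers.foreach(_ => queue.put(None))
    workers.foreach(_.join()) // join makes the workers' writes visible here

    (Vector.tabulate(items.length)(i => results.get(i)), workersCreated.get)
  }
}
```

Here the number of workers is fixed by n and does not depend on the size of the input. The only per-element state is the result slot, which plays the role of the promise in the proposal above.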

Labels: bug, good first issue