Conversation

@myazinn
Contributor

@myazinn myazinn commented Jul 13, 2023

TL;DR — here are the benchmark results:

[info] Benchmark                                                     (chunkSize)   Mode  Cnt          Score     Error   Units
[info] StreamBenchmarks.FromIteratorSucceedCurr                                1  thrpt   10         30.104 ±   0.561   ops/s
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.alloc.rate                 1  thrpt   10       3582.986 ±  66.872  MB/sec
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.alloc.rate.norm            1  thrpt   10  124809074.882 ± 211.221    B/op
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.count                      1  thrpt   10        680.000            counts
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.time                       1  thrpt   10        384.000                ms
[info] StreamBenchmarks.FromIteratorSucceedCurr                               10  thrpt   10        156.871 ±   0.567   ops/s
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.alloc.rate                10  thrpt   10       3513.820 ±  12.655  MB/sec
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.alloc.rate.norm           10  thrpt   10   23488097.528 ± 476.264    B/op
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.count                     10  thrpt   10        666.000            counts
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.time                      10  thrpt   10        358.000                ms
[info] StreamBenchmarks.FromIteratorSucceedCurr                              100  thrpt   10        311.327 ±   1.309   ops/s
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.alloc.rate               100  thrpt   10       3801.647 ±  15.922  MB/sec
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.alloc.rate.norm          100  thrpt   10   12804561.492 ± 220.695    B/op
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.count                    100  thrpt   10        719.000            counts
[info] StreamBenchmarks.FromIteratorSucceedCurr:·gc.time                     100  thrpt   10        406.000                ms
[info] StreamBenchmarks.FromIteratorSucceedNew                                 1  thrpt   10         55.778 ±   0.923   ops/s
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.alloc.rate                  1  thrpt   10       4111.381 ±  68.025  MB/sec
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.alloc.rate.norm             1  thrpt   10   77292211.067 ± 297.569    B/op
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.count                       1  thrpt   10        783.000            counts
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.time                        1  thrpt   10        413.000                ms
[info] StreamBenchmarks.FromIteratorSucceedNew                                10  thrpt   10        407.991 ±   2.188   ops/s
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.alloc.rate                 10  thrpt   10       3981.714 ±  21.330  MB/sec
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.alloc.rate.norm            10  thrpt   10   10233648.792 ±  53.680    B/op
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.count                      10  thrpt   10        754.000            counts
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.time                       10  thrpt   10        416.000                ms
[info] StreamBenchmarks.FromIteratorSucceedNew                               100  thrpt   10       1653.623 ±   6.383   ops/s
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.alloc.rate                100  thrpt   10       4897.999 ±  18.928  MB/sec
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.alloc.rate.norm           100  thrpt   10    3105935.255 ±   1.144    B/op
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.count                     100  thrpt   10        926.000            counts
[info] StreamBenchmarks.FromIteratorSucceedNew:·gc.time                      100  thrpt   10        592.000                ms

The new version is several times faster and puts less pressure on the GC.
Here's the benchmark. I can add it to the MR, but I'm not sure where it belongs — zio.StreamBenchmarks?

import org.openjdk.jmh.annotations.{Scope, _}
import zio._
import zio.stream._

import java.util.concurrent.TimeUnit

@State(Scope.Benchmark)
@BenchmarkMode(Array(Mode.Throughput))
@OutputTimeUnit(TimeUnit.SECONDS)
@Fork(1)
@Measurement(iterations = 10, timeUnit = TimeUnit.SECONDS, time = 5)
@Warmup(iterations = 5, timeUnit = TimeUnit.SECONDS, time = 5)
@Threads(1)
class StreamBenchmarks {

  @Param(Array("1", "10", "100"))
  var chunkSize: Int = _

  private val data = 1 to 1024

  private val runtime = Runtime.default

  private def replicate(i: Int) = Iterator.fill(64)(i)

  @Benchmark
  def FromIteratorSucceedCurr = unsafeRun {
    ZStream.fromIterable(data).flatMap { i =>
      fromIteratorSucceedCurr(replicate(i), chunkSize)
    }.runDrain
  }

  @Benchmark
  def FromIteratorSucceedNew = unsafeRun {
    ZStream.fromIterable(data).flatMap { i =>
      fromIteratorSucceedNew(replicate(i), chunkSize)
    }.runDrain
  }

  private def unsafeRun[A](zio: Task[A]): A = Unsafe.unsafe(implicit unsafe => runtime.unsafe.run(zio).getOrThrow())

  private def fromIteratorSucceedCurr[A](iterator: => Iterator[A], maxChunkSize: => Int): ZStream[Any, Nothing, A] =
    ZStream.fromIteratorSucceed(iterator, maxChunkSize)

  private def fromIteratorSucceedNew[A](iterator: => Iterator[A], maxChunkSize: => Int)(implicit trace: Trace): ZStream[Any, Nothing, A] = {
    def writeOneByOne(iterator: Iterator[A]): ZChannel[Any, Any, Any, Any, Nothing, Chunk[A], Any] =
      if (iterator.hasNext)
        ZChannel.write(Chunk.single(iterator.next())) *> writeOneByOne(iterator)
      else
        ZChannel.unit

    def writeChunks(iterator: Iterator[A]): ZChannel[Any, Any, Any, Any, Nothing, Chunk[A], Any] =
      ZChannel.succeed(ChunkBuilder.make[A]()).flatMap { builder =>
        def loop(iterator: Iterator[A]): ZChannel[Any, Any, Any, Any, Nothing, Chunk[A], Any] = {
          builder.clear()
          var count = 0
          while (count < maxChunkSize && iterator.hasNext) {
            builder += iterator.next()
            count += 1
          }
          if (count > 0)
            ZChannel.write(builder.result()) *> loop(iterator)
          else
            ZChannel.unit
        }

        loop(iterator)
      }

    ZStream.fromChannel {
      ZChannel.suspend {
        if (maxChunkSize == 1)
          writeOneByOne(iterator)
        else
          writeChunks(iterator)
      }
    }
  }
}

The original intention was to optimize ZStream.apply, since it is unexpectedly slow for such a seemingly harmless convenience method. And since ZStream.apply relies on ZStream.fromIteratorSucceed, this MR changes the latter.
I know that in the real world these changes will most likely be unnoticeable, but IMO it is still worth merging. I found this while comparing the performance of ZIO Streams against Akka Streams and fs2: I was surprised by how slow ZIO Streams were, until I discovered that this particular method was causing the degradation.
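For readers skimming the diff: the core idea in fromIteratorSucceedNew is to reuse a single builder, refill it with up to maxChunkSize elements per pass, and emit each batch. A minimal standalone sketch of that batching loop, using plain Scala collections in place of ZChannel/Chunk so it runs without ZIO (ChunkingSketch and its names are illustrative, not part of the PR):

```scala
// Hypothetical sketch of the chunk-batching loop from fromIteratorSucceedNew.
// Vector stands in for zio.Chunk; a reusable Vector builder stands in for
// ChunkBuilder. The real code emits each batch through ZChannel.write.
object ChunkingSketch {
  def chunked[A](iterator: Iterator[A], maxChunkSize: Int): Vector[Vector[A]] = {
    val out     = Vector.newBuilder[Vector[A]]
    val builder = Vector.newBuilder[A] // reused across batches, like ChunkBuilder
    var done    = false
    while (!done) {
      builder.clear()
      var count = 0
      // Drain up to maxChunkSize elements into the current batch.
      while (count < maxChunkSize && iterator.hasNext) {
        builder += iterator.next()
        count += 1
      }
      if (count > 0) out += builder.result()
      else done = true // iterator exhausted: stop, like ZChannel.unit
    }
    out.result()
  }
}
```

For example, chunked(Iterator(1, 2, 3, 4, 5), 2) yields Vector(Vector(1, 2), Vector(3, 4), Vector(5)). The writeOneByOne path in the PR is the maxChunkSize == 1 special case, which skips the builder entirely and wraps each element in Chunk.single.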

@CLAassistant

CLAassistant commented Jul 13, 2023

CLA assistant check
All committers have signed the CLA.

@myazinn myazinn force-pushed the optimize_ZStream.fromIteratorSucceed branch from ba60c2d to 47cb9e2 Compare July 13, 2023 18:48
Contributor

@adamgfraser adamgfraser left a comment


Thank you! 🙏

@adamgfraser adamgfraser merged commit b12849c into zio:series/2.x Jul 13, 2023
