Use Chunk.single rather than Chunk.apply [series/2.0.x] #8718

ghostdogpr · 2024-04-09T06:28:32Z

Constructing a Chunk of size 1 is faster with Chunk.single than with Chunk.apply, which goes through Chunk.fromIterable. I found the call coming from interruptAsFork while profiling an app:

core/shared/src/main/scala/zio/NonEmptyChunk.scala

kyri-petrou · 2024-04-09T06:51:27Z

On second thought, would it work if Chunk.apply was overloaded to accept a single element? Or would that not be binary-compatible?

  def apply[A](as: A): Chunk[A] =
    single(as)

ghostdogpr · 2024-04-09T07:06:29Z

> On second thought, would it work if Chunk.apply was overloaded to accept a single element? Or would that not be binary-compatible?
>   def apply[A](as: A): Chunk[A] =
>     single(as)

Mima seems to be happy, so I will open another PR.

ghostdogpr · 2024-04-09T07:18:29Z

@kyri-petrou it didn't work with 2.12 so reopening this one

jdegoes · 2024-04-09T09:28:26Z

@ghostdogpr

Can I instead suggest the existing .apply constructor call nonEmpty on the sequence, to decide internally whether to call Chunk.single? Then the API does not need to be carefully used and it does not need to be expanded.


ghostdogpr · 2024-04-09T11:24:42Z

Actually apply is probably going to be called only with very small values so it's okay to check the size. Added this:

override def apply[A](as: A*): Chunk[A] =
    if (as.size == 1) single(as.head) else fromIterable(as)

kyri-petrou · 2024-04-09T11:29:44Z

> Actually apply is probably going to be called only with very small values so it's okay to check the size. Added this:
> override def apply[A](as: A*): Chunk[A] =
>     if (as.size == 1) single(as.head) else fromIterable(as)

I didn't know this until now, but the Seq created by def apply[A](as: A*) is actually an ArraySeq and has a knownSize defined. So calling .size should have O(1) complexity. In case you want to try it out:

object Foo {
  def apply(vs: String*): Unit = {
    println(vs)
    println(vs.knownSize)
  }

  @main def m = Foo("asd", "asd", "")
}

erikvanoosten · 2024-04-17T17:20:01Z

What I have always understood from my Java time is that a vararg method is actually implemented with arrays. Scala presents this as a Seq, but presumably it actually is still an array at runtime. According to https://users.scala-lang.org/t/storage-backing-varargs/1997 we can get the underlying array with toArray. With that in place the more efficient implementation would be:

override def apply[A](as: A*): Chunk[A] =
    if (as.size == 1) single(as.head) else fromArray(as.toArray)

WDYT?

erikvanoosten · 2024-04-17T17:37:27Z

> but presumably it actually is still an array at runtime

Hmm, after reading some more, I am no longer sure this is true 🤔

ghostdogpr · 2024-04-17T23:10:49Z

Looks like we get an ArraySeq so I think you're right that toArray is free. It would rely on internal implementation details that could change though, but I think it's okay.

erikvanoosten · 2024-04-18T06:55:51Z

> Looks like we get an ArraySeq ...

Nope, I was wrong:

scala> def foo(as: String*) = { println(as.getClass) }
def foo(as: String*): Unit

scala> foo("a","b")
class scala.collection.immutable.ArraySeq$ofRef  // all fine so far

scala> foo(Vector("a","b")*)
class scala.collection.immutable.Vector1   // but not here

We could specialize for ArraySeq though...

kyri-petrou · 2024-04-18T08:56:16Z

I wonder if it's worth bringing in a dependency on scala-collection-compat. It's very widely used, and it allows for a lot of optimizations to be done by having sizeCompare available on iterables.

erikvanoosten · 2024-04-18T09:38:02Z

scala>   def bar(as: String*) =
     |     if (as.isInstanceOf[scala.collection.immutable.ArraySeq[?]]) "array!"
     |     else "iterable"
def bar(as: String*): String

scala> bar("a", "b")
val res0: String = array!

scala> bar(Vector("a", "b")*)
val res1: String = iterable

This works, so a good Chunk.apply implementation could be:

  override def apply[A](as: A*): Chunk[A] =
    if (as.size == 1) single(as.head)
    else if (as.isInstanceOf[scala.collection.immutable.ArraySeq[?]]) fromArray(as.toArray)
    else fromIterable(as)

Not sure how to test the performance of this though.

kyri-petrou · 2024-04-18T11:51:15Z

@erikvanoosten I think there are 2 separate things here:

1. Applying an optimization for cases where as is an ArraySeq.
2. Making sure that as.size doesn't iterate through a large collection.

For (1), I think we can probably ignore optimizations for this case, since the number of elements passed is likely to be very small; they're usually written in this kind of fashion: Chunk(1,2,3)

Now (2) is indeed a bit dangerous (performance-wise), and we should probably try and work something out. The most dangerous case I can think of is this one:

val someList = List.fill(10_000)("foo")
val chunk = Chunk(someList*)

In this case, the Seq that is passed to the apply(as: A*) method will be a List, and the call to size will need to iterate through the entire list to determine its size.
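
A minimal sketch of that difference, assuming Scala 3 syntax on the 2.13 collections library. VarargsSizeDemo and describe are hypothetical names, not ZIO code. It reports the runtime class name and knownSize of the Seq that a varargs method receives:

```scala
object VarargsSizeDemo {
  // Hypothetical helper: returns the runtime class name and knownSize
  // of the Seq that backs the varargs parameter.
  def describe[A](as: A*): (String, Int) =
    (as.getClass.getName, as.knownSize)

  def main(args: Array[String]): Unit = {
    // Literal arguments: the compiler wraps an array, so the size is known.
    println(describe("a", "b", "c"))
    // Spliced List: the List itself is passed through unchanged.
    println(describe(List.fill(10000)("foo")*))
  }
}
```

For the literal call the Seq is array-backed and knownSize equals the element count. For the spliced non-empty List, knownSize is -1, which is why size has to walk the whole list.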

erikvanoosten · 2024-04-18T13:12:23Z

@kyri-petrou You are right: since you can call vararg methods with any Seq (due to the * operator), you cannot be sure whether size needs to traverse the collection.

So the optimization in this PR is questionable.

> For (1), I think we can probably ignore optimizations for this case, since the number of elements passed is likely to be very small; they're usually written in this kind of fashion: Chunk(1,2,3)

If we're here for micro-optimizations: my intuition says that the extra if statement will be faster than an extra allocation.

We can fix (2) by using knownSize. I kept the if statement for (1):

  override def apply[A](as: A*): Chunk[A] =
    if (as.knownSize == 1) single(as.head)
    else if (as.isInstanceOf[scala.collection.immutable.ArraySeq[?]]) fromArray(as.toArray)
    else fromIterable(as)
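
To see which branch each kind of call would take, here is a hypothetical stand-in (not the ZIO implementation) that returns the branch name instead of building a Chunk. ApplyDispatchSketch and path are made-up names, and Scala 3 syntax on the 2.13 collections library is assumed:

```scala
import scala.collection.immutable.ArraySeq

object ApplyDispatchSketch {
  // Hypothetical: mirrors the three-way check above, but only reports
  // which construction path would be chosen.
  def path[A](as: A*): String =
    if (as.knownSize == 1) "single"
    else
      as match {
        case _: ArraySeq[?] => "fromArray"    // array-backed varargs
        case _              => "fromIterable" // any other spliced Seq
      }
}
```

With this check, a spliced one-element List has a knownSize of -1 and therefore takes the fromIterable path. The result would still be correct; it just misses the fast path without ever traversing the list.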

kyri-petrou · 2024-04-18T13:30:37Z

@erikvanoosten problem with knownSize is that it's not available in Scala 2.12. Thus why I recommended this #8718 (comment)

erikvanoosten · 2024-04-18T13:51:40Z

> @erikvanoosten problem with knownSize is that it's not available in Scala 2.12. Thus why I recommended this #8718 (comment)

Ah, now I understand why you wanted this.

I believe IndexedSeq does exist in 2.12. So we could also check against that:

  override def apply[A](as: A*): Chunk[A] =
    if (as.isInstanceOf[scala.collection.immutable.IndexedSeq[?]] && as.size == 1) single(as.head)
    else if (as.isInstanceOf[scala.collection.immutable.ArraySeq[?]]) fromArray(as.toArray)
    else fromIterable(as)

kyri-petrou · 2024-04-18T14:50:14Z

@erikvanoosten this feels like a workaround for something that there is already a solution for, which is to bring in scala-collection-compat as a dependency. I know it's not ideal to bring in a dependency to a project that has none, but most of the zio-* projects already depend on it. For Scala 2.13 and Scala 3, the dependency doesn't bring much code in (none for 3, almost none for 2.13).

The reason I'm advocating to bring it in is because I had to run through similar loops in zio-query myself until I decided to add it as a dependency. At the end, we're very likely going to miss out on optimizations if we don't do it. As an example, the implementation of foreachParX currently uses .size, when a simple sizeCompare would be much cheaper for Lists:

zio/core/shared/src/main/scala/zio/ZIO.scala

Lines 5979 to 5986 in 4a29804

    
    private def foreachParDiscard[R, E, A](
      n: => Int
    )(as0: => Iterable[A])(f: A => ZIO[R, E, Any])(implicit trace: Trace): ZIO[R, E, Unit] =
      ZIO.suspendSucceed {
        val as   = as0
        val size = as.size
        if (size == 0) ZIO.unit
        else if (size == 1) f(as.head).unit
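
A minimal sketch of the sizeCompare idea, separate from the ZIO code above. sizeCompare is a standard method on Iterable in Scala 2.13 and later; on 2.12 it is assumed to come from scala-collection-compat, as suggested in this thread. SizeCompareSketch and classify are hypothetical names:

```scala
object SizeCompareSketch {
  // Hypothetical helper: distinguishes empty / single / many without
  // computing the full size. For a List, sizeCompare(1) inspects at most
  // the first two cells instead of traversing everything.
  def classify[A](as: Iterable[A]): String = {
    val cmp = as.sizeCompare(1)
    if (cmp < 0) "empty"
    else if (cmp == 0) "single"
    else "many"
  }
}
```

Because the comparison is bounded, it also terminates on an infinite LazyList, where calling size would never return.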

ghostdogpr · 2024-04-18T14:54:16Z

Sounds okay to me to depend on scala-collection-compat. @jdegoes any opinion?

jdegoes · 2024-04-18T18:18:37Z

I think it's okay, but .nonEmpty / .isEmpty is sufficient for many cases where the class is not known. And if the class is known (via type casing), you don't need size because you can use .length.
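
One way to read that suggestion, as a sketch only: type-case on IndexedSeq to use length, and otherwise look at no more than two elements through the iterator. SingleElementCheck and onlyElement are hypothetical names, not part of ZIO, and the code uses only operations that should also exist on Scala 2.12:

```scala
object SingleElementCheck {
  // Hypothetical helper: returns Some(element) when `as` has exactly one
  // element, without ever calling size on a collection of unknown class.
  def onlyElement[A](as: Iterable[A]): Option[A] =
    as match {
      case xs: scala.collection.IndexedSeq[A] =>
        // Class is known to be indexed, so length is cheap.
        if (xs.length == 1) Some(xs(0)) else None
      case _ =>
        // Unknown class: advance the iterator at most twice.
        val it = as.iterator
        if (!it.hasNext) None
        else {
          val first = it.next()
          if (it.hasNext) None else Some(first)
        }
    }
}
```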

Use Chunk.single rather than Chunk.apply

8b3571e

ghostdogpr mentioned this pull request Apr 9, 2024

Use Chunk.single rather than Chunk.apply [series/2.x] #8719

Merged

Fix

ce00a03

kyri-petrou reviewed Apr 9, 2024

core/shared/src/main/scala/zio/NonEmptyChunk.scala

Improve

547b36c

ghostdogpr closed this Apr 9, 2024

ghostdogpr deleted the chunk-2.0 branch April 9, 2024 07:06

ghostdogpr restored the chunk-2.0 branch April 9, 2024 07:16

ghostdogpr reopened this Apr 9, 2024

Fix NonEmptyChunk.single

cb6c4c2

guizmaii approved these changes Apr 9, 2024


ghostdogpr and others added 5 commits April 9, 2024 18:39

Optimize fromIterable

bb25739

Fix

86c9993

Revert "Fix"

12c5602

This reverts commit 86c9993.

Revert "Optimize fromIterable"

08785fc

This reverts commit bb25739.

Optimize Chunk.apply

b779e03

jdegoes merged commit 582270a into zio:series/2.0.x Apr 9, 2024

ghostdogpr deleted the chunk-2.0 branch April 9, 2024 23:42
