Make Chunk Extend IndexedSeq #3342

adamgfraser · 2020-04-12T12:26:23Z

Makes Chunk extend IndexedSeq from the Scala standard library. This will improve interoperability with other code and pave the way for us to unify the Iterable and Chunk versions of a variety of combinators.

Because of changes to Scala's collection library in 2.13, this is implemented in terms of a new version-specific trait, ChunkLike, that Chunk extends. ChunkLike extends the appropriate traits for each version and implements the corresponding builder interface, so that code in Chunk can continue to be written on a cross-version basis.

Potential issues:

I think we are going to have to give up on empty ++ nonempty returning a nonempty chunk because we now have another ++ variant on IndexedSeq. We could address this by picking another operator for combining two chunks that doesn't clash, which would get back the current behavior.
A few methods had to be moved to ChunkLike due to version-specific differences. In some cases Scala 2.13 provides a signature that matches ours because it doesn't include CanBuildFrom, but Scala 2.12 doesn't, so we need to override in one case but not the other. In other cases methods are defined as final in IndexedSeq in Scala 2.13 (e.g. size is defined in terms of length), so we can't override them.
The ChunkBuilder implementation could be optimized further. It is already backed by an ArrayBuilder that has specialized implementations for primitives, but we could override the ++= method for increased efficiency.
Chunk#fold has to be renamed Chunk#foldLeft. I think this is fine, and it probably should have been this way all along, but it is a breaking change.

iravid

Love this. Great idea.

Had one question about the inference regression, but other than that I'm 👍 on this!

iravid · 2020-04-12T13:57:00Z

core-tests/shared/src/test/scala/zio/ChunkSpec.scala

      c.foreach(_ => ())

-      assert(c.filter(_ => false).map(_ * 2).length)(equalTo(0))
+      assert(c.filter(_ => false).map[Int](_ * 2).length)(equalTo(0))


Could you explain why the annotations are required on map here?

The issue is that we now have two overloaded versions of map on Scala 2.11 and 2.12:

    // From Chunk
    def map[B](f: A => B): Chunk[B]

    // From IndexedSeq
    def map[B, That](f: A => B)(implicit bf: CanBuildFrom[List[A], B, That]): That

With both overloads present, the Scala compiler, especially on 2.11, has a harder time inferring these types and complains that the type of an anonymous function should be fully known (even though it seems like it should be inferable here).

To address this, I think we need to consider moving methods like this to ChunkLike and having only one implementation on Scala 2.11 / Scala 2.12 that uses the signature from IndexedSeq but has an efficient implementation.

Ah, I see. Will we be able to retain the efficient implementation given that signature? We'd have to go through the CBF and won't be able to use an array directly.

That's the question. I think we can. We can match on the CBF to get back a ChunkBuilder and add methods on the ChunkBuilder to support an efficient implementation in that case.

Ah yeah. That sounds viable!

Worth writing up in tickets so we don't forget!

I am incorporating a solution in this PR. Will push an updated version shortly.

jdegoes · 2020-04-12T20:52:08Z

core-tests/shared/src/test/scala/zio/ChunkSpec.scala

      check(chunkWithIndex(Gen.unit)) {
        case (chunk, i) =>
-          assert(chunk.apply(i))(equalTo(chunk.toSeq.apply(i)))
+          assert(chunk.apply(i))(equalTo(chunk.toList.apply(i)))


We can omit the whole .toList and use the apply directly on Chunk.

Ah never mind, it's testing that it's the same as List implementation.

I think the goal in this test was to make sure the apply method itself is accurate, and so we are comparing the apply implementation in Chunk with the known correct value from accessing the same index in the list. I suppose maybe it would be better to check against Vector here.

jdegoes · 2020-04-12T20:54:00Z

core-tests/shared/src/test/scala/zio/ChunkSpec.scala

+    //   val empty: Chunk[B] = Chunk.empty

-      val _: NonEmptyChunk[A] = empty ++ Chunk(new A {})
+    //   val _: NonEmptyChunk[A] = empty ++ Chunk(new A {})


Dead code in light of the changes to ++? Here and above.

Yes, I think we are going to have to add a new operator with a different name, something like concatNonEmpty.
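
For illustration only, here is a minimal sketch of what a separately named concatenation could look like. It uses a hypothetical `NonEmptyVec` over the standard library's `Vector` as a stand-in for `NonEmptyChunk`; only the name `concatNonEmpty` comes from the comment above, and everything else is an assumption rather than ZIO's API.

```scala
object ConcatSketch {
  // Hypothetical non-empty collection: the head is always present.
  final case class NonEmptyVec[A](head: A, tail: Vector[A]) {
    def toVector: Vector[A] = head +: tail
  }

  // Because this is not an overload of ++, the non-empty result type
  // does not compete with the ++ inherited from IndexedSeq.
  def concatNonEmpty[A](left: Vector[A], right: NonEmptyVec[A]): NonEmptyVec[A] =
    if (left.isEmpty) right
    else NonEmptyVec(left.head, left.tail ++ right.toVector)
}
```

With a distinct name, the "possibly empty followed by non-empty" case keeps its non-empty static type regardless of how `++` resolves.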

jdegoes

Tricky and delicate work here; love the attention to detail.

This exciting work is going to make Chunk the go-to collection type!

adamgfraser · 2020-04-13T02:28:48Z

This is ready for another review. I added a ChunkCanBuildFrom subtype of CanBuildFrom. We can pattern match against this to prove that the target type That is a Chunk, allowing us to use all of our existing efficient implementations of Chunk combinators.
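
The pattern-matching idea can be sketched in a version-independent toy model. `Factory`, `BoxFactory`, and `Box` below are hypothetical stand-ins for `CanBuildFrom`, `ChunkCanBuildFrom`, and `Chunk`; this is not the code from the PR, just an illustration of how a marker subtype can unlock a fast path.

```scala
import scala.collection.mutable

object CanBuildFromSketch {
  // Hypothetical stand-in for CanBuildFrom: builds a `To` from `Elem`s.
  trait Factory[Elem, To] {
    def newBuilder(): mutable.Builder[Elem, To]
  }

  object Factory {
    // Found through the implicit scope of Factory.
    implicit def boxFactory[A]: Factory[A, Box[A]] = new BoxFactory[A]
  }

  // Hypothetical stand-in for ChunkCanBuildFrom: a marker subtype.
  final class BoxFactory[A] extends Factory[A, Box[A]] {
    def newBuilder(): mutable.Builder[A, Box[A]] =
      Vector.newBuilder[A].mapResult(items => new Box[A](items))
  }

  // Hypothetical stand-in for Chunk.
  final class Box[A](val items: Vector[A]) {
    def map[B, To](f: A => B)(implicit factory: Factory[B, To]): To =
      factory match {
        case _: BoxFactory[_] =>
          // Fast path: the marker shows that To is Box[B], so the cast is safe.
          new Box[B](items.map(f)).asInstanceOf[To]
        case _ =>
          // Generic path: go through whatever builder the factory provides.
          val builder = factory.newBuilder()
          items.foreach(a => builder += f(a))
          builder.result()
      }
  }
}
```

Calling `new CanBuildFromSketch.Box(Vector(1, 2, 3)).map(_ * 2)` resolves the implicit to `BoxFactory` and takes the fast path, while callers that supply a different factory still get a correct result through the builder.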

jdegoes · 2020-04-13T12:08:03Z

core/shared/src/main/scala-2.11-2.12/ChunkLike.scala

+    while (i < len) {
+      val chunk = f(self(i)) match {
+        case chunk: Chunk[B] => chunk
+        case other           => Chunk.fromIterable(other.toList)


This toList will be detrimental to performance. Probably best to do Chunk.fromArray(other.toArray) if we need to do that.

Will change.

We can't call toArray here because we don't have a ClassTag at this point. May make sense to refactor this so we just iterate over the original collection and either get the class tag from the Chunk if it is a chunk or otherwise from the first value of the collection using Tag.fromValue.

Right. I think if we can only iterate over it once, then the best we can do is use the chunk builder to turn the "iterable once" into a chunk.

OTOH if we match against a few other cases, e.g. Vector, List, etc., we know what collection type we are dealing with and can handle it more specially (e.g. Vector we can preallocate, List cannot be pre-allocated since it's O(n) to determine the size, etc.).

But open question how much of that performance work to do here. I'm fine merging this in now, we can create tickets for performance work.

I've got an idea about how we can do this efficiently. Will do a turn on it now.
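
As a rough sketch of the "match on the collection type" idea, the helper below pre-allocates only when the size is cheap to obtain. `copyToBuffer` is a hypothetical name, and an `ArrayBuffer` stands in for whatever `Chunk` would use internally; the approach actually taken in the PR may differ.

```scala
import scala.collection.mutable.ArrayBuffer

object PreallocateSketch {
  // Hypothetical helper: copies any Iterable into an ArrayBuffer.
  def copyToBuffer[A](source: Iterable[A]): ArrayBuffer[A] = {
    val buffer = source match {
      // Indexed sequences such as Vector report their length cheaply,
      // so the backing array can be sized once up front.
      case indexed: scala.collection.IndexedSeq[_] => new ArrayBuffer[A](indexed.length)
      // For List and others, computing the size needs a traversal,
      // so start with the default capacity and let the buffer grow.
      case _ => new ArrayBuffer[A]()
    }
    source.foreach(a => buffer += a)
    buffer
  }
}
```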

jdegoes · 2020-04-13T12:08:28Z

core/shared/src/main/scala-2.11-2.12/ChunkLike.scala

+   * and is not referentially transparent. It is provided for compatibility
+   * with Scala's collection library and should not be used for other purposes.
+   */
+  override protected[this] def newBuilder: ChunkBuilder[A] =


By convention, side-effecting methods should have an empty parameter list: def newBuilder().

Yes. The signature of the newBuilder method we are overriding doesn't have the nullary parameter list. Let me see if I can add and still have it be recognized as a valid override.

Yes, unfortunately adding the empty parameter list generates a compiler error, so I think we will have to leave it as is.

jdegoes · 2020-04-13T12:11:57Z

core/shared/src/main/scala-2.13+/ChunkBuilder.scala

+  /**
+   * Constructs a new `ChunkBuilder`.
+   */
+  def make[A]: ChunkBuilder[A] =


Suggested change

def make[A]: ChunkBuilder[A] =

def make[A](capacity: Int = 10): ChunkBuilder[A] =

I saw you could call ensureSize on ArrayBuilder if you subclass it, which would allow "pre-allocation" in cases where it's known roughly how many things will be added.

Yes, I think we want to override the sizeHint method, because these methods will mostly be called by Scala collection library combinators that are outside our control.
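
To make the sizeHint suggestion concrete, here is a minimal sketch of a builder that honors size hints, assuming Scala 2.13 (where `Builder` requires `addOne`, `clear`, and `result`). `IntArrayBuilder` and its `capacity` accessor are hypothetical and exist only for this example; this is not the `ChunkBuilder` from the PR.

```scala
import scala.collection.mutable

// Hypothetical builder, Scala 2.13 only: collects Ints into an Array[Int].
final class IntArrayBuilder extends mutable.Builder[Int, Array[Int]] {
  private var array: Array[Int] = new Array[Int](4)
  private var used: Int         = 0

  // Grows the backing array so that it can hold at least n elements.
  private def ensureCapacity(n: Int): Unit =
    if (n > array.length)
      array = java.util.Arrays.copyOf(array, math.max(n, array.length * 2))

  // Collection combinators may call this before adding elements,
  // which lets the builder allocate once instead of growing repeatedly.
  override def sizeHint(n: Int): Unit = ensureCapacity(n)

  def addOne(elem: Int): this.type = {
    ensureCapacity(used + 1)
    array(used) = elem
    used += 1
    this
  }

  def clear(): Unit = { used = 0 }

  // Returns a copy trimmed to the number of elements actually added.
  def result(): Array[Int] = java.util.Arrays.copyOf(array, used)

  // Exposed only so the effect of sizeHint can be observed in this sketch.
  def capacity: Int = array.length
}
```

Overriding `sizeHint` rather than adding a `capacity` parameter to `make` means the pre-allocation also applies when the builder is driven by standard library code that never calls `make` with arguments.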

jdegoes · 2020-04-13T12:12:50Z

core/shared/src/main/scala-2.13+/ChunkLike.scala

+    while (i < len) {
+      val chunk = f(self(i)) match {
+        case chunk: Chunk[B] => chunk
+        case other           => Chunk.fromIterable(other.iterator.to(List))


Chunk.fromArray(...toArray) if possible.

If not possible maybe we should directly make a Chunk.fromIterator or something since going "through" List will not be efficient.

I think adding that method is a good idea.
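
A minimal sketch of the single-pass approach: drain the iterator straight into a builder instead of materializing a `List` first. `Vector` and its builder stand in for `Chunk` and `ChunkBuilder` here, and the enclosing object name is hypothetical.

```scala
object FromIteratorSketch {
  // Consumes the iterator exactly once, with no intermediate List.
  def fromIterator[A](iterator: Iterator[A]): Vector[A] = {
    val builder = Vector.newBuilder[A]
    while (iterator.hasNext) builder += iterator.next()
    builder.result()
  }
}
```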

jdegoes · 2020-04-13T12:13:14Z

core/shared/src/main/scala-2.13+/ChunkLike.scala

+   * Constructs a `Chunk` from the specified `IterableOnce`.
+   */
+  def from[A](source: IterableOnce[A]): Chunk[A] =
+    Chunk.fromIterable(source.iterator.to(Iterable))


jdegoes · 2020-04-13T12:13:39Z

project/BuildHelper.scala

              CrossType.Full.sharedSrcDir(baseDirectory.value, "test").toList.map(f => file(f.getPath + "-2.12+")),
-              CrossType.Full.sharedSrcDir(baseDirectory.value, "main").toList.map(f => file(f.getPath + "-dotty"))
+              CrossType.Full.sharedSrcDir(baseDirectory.value, "main").toList.map(f => file(f.getPath + "-dotty")),
+              CrossType.Full.sharedSrcDir(baseDirectory.value, "main").toList.map(f => file(f.getPath + "-2.13+"))


Sbt's not going to like this. 😆

I was surprised this worked as smoothly as it did!
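
For readers unfamiliar with version-specific source directories, a simplified sbt setting along the following lines selects a directory per Scala version. The directory names match the paths in this PR, but the layout under `src/main` is an assumption; the real build uses the cross-project shared directories shown in the diff above.

```scala
// build.sbt fragment (sketch): add one version-specific source directory.
Compile / unmanagedSourceDirectories ++= {
  val base = (Compile / sourceDirectory).value
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, minor)) if minor >= 13 => Seq(base / "scala-2.13+")
    case Some((2, _))                    => Seq(base / "scala-2.11-2.12")
    case _                               => Seq.empty
  }
}
```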

jdegoes

Excellent work! A few minor tips to improve performance, but overall looks great.

adamgfraser · 2020-04-13T14:17:27Z

Done.

extend IndexedSeq

27a84c8

adamgfraser requested review from mijicd and softinio as code owners April 12, 2020 12:26

adamgfraser added 2 commits April 12, 2020 09:25

fix Dotty compatibility issue

4bfb45a

fix Scala 2.11 compatibility issues

930f286

iravid reviewed Apr 12, 2020

adamgfraser added 2 commits April 12, 2020 11:57

fix build failures

baf9b07

format

8527eb6

jdegoes reviewed Apr 12, 2020

jdegoes previously approved these changes Apr 12, 2020

implement ChunkCanBuildFrom

a12268b

adamgfraser dismissed jdegoes’s stale review via a12268b April 13, 2020 00:00

adamgfraser added 2 commits April 12, 2020 20:46

resolve merge conflicts

fe1b65b

enable test

c5765b4

adamgfraser requested review from iravid and jdegoes April 13, 2020 02:28

jdegoes reviewed Apr 13, 2020

jdegoes previously approved these changes Apr 13, 2020

optimize

8aa1e47

adamgfraser dismissed jdegoes’s stale review via 8aa1e47 April 13, 2020 13:39

adamgfraser requested a review from jdegoes April 13, 2020 13:59

adamgfraser added 2 commits April 13, 2020 10:15

use isEmpty

29b3daf

remove singleton case

61b9e62

jdegoes approved these changes Apr 13, 2020

jdegoes merged commit e72fa8a into zio:master Apr 13, 2020

ioleo mentioned this pull request Apr 19, 2020

Publishing broken since #3342 #3410

Closed

adamgfraser deleted the chunk branch April 22, 2020 01:59
