Conversation

@svranesevic
Contributor

@svranesevic svranesevic commented Jun 26, 2020

Replace usage of List in favor of a more performant Chunk throughout ZTransducer.collectAllN

@svranesevic svranesevic requested a review from iravid as a code owner June 26, 2020 20:46
@CLAassistant

CLAassistant commented Jun 26, 2020

CLA assistant check
All committers have signed the CLA.

@simpadjo
Contributor

Yes, it's very tempting to get rid of lists. Unfortunately, Chunk :+ is not as fast.
You can get around it by using ChunkBuilder or even ArrayBuilder as an accumulator.
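
The difference between appending with :+ and accumulating in a builder can be sketched with the Scala standard library alone. This is only an illustration under assumptions: Vector stands in for Chunk, Array.newBuilder supplies the ArrayBuilder, and the names viaAppend and viaBuilder are hypothetical, not part of ZIO.

```scala
object AccumulatorSketch {
  // Each :+ returns a new immutable collection, so the fold
  // creates one intermediate value per element.
  def viaAppend(xs: List[Int]): Vector[Int] =
    xs.foldLeft(Vector.empty[Int])(_ :+ _)

  // A builder mutates internal storage while elements arrive
  // and materializes the result once, at the end.
  def viaBuilder(xs: List[Int]): Array[Int] = {
    val builder = Array.newBuilder[Int] // an ArrayBuilder[Int]
    xs.foreach(builder += _)
    builder.result()
  }
}
```

Both functions return the same elements in the same order; only the amount of intermediate allocation differs.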

@svranesevic
Contributor Author

I thought about going the ChunkBuilder route. A nice addition there would be giving the builder a size hint in collectAllN.
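
A size hint, as mentioned here, is part of the standard Builder interface (sizeHint). The sketch below is an assumption-laden illustration using Vector.newBuilder rather than ChunkBuilder; takeGroup is a hypothetical helper name.

```scala
object SizeHintSketch {
  // Collect at most n elements from the iterator into one group.
  def takeGroup[A](n: Int, xs: Iterator[A]): Vector[A] = {
    val builder = Vector.newBuilder[A]
    // Only a hint about the expected size: the result is the same
    // with or without it, but the builder may preallocate.
    builder.sizeHint(n)
    var i = 0
    while (i < n && xs.hasNext) {
      builder += xs.next()
      i += 1
    }
    builder.result()
  }
}
```

Since collectAllN knows the group size up front, n is the natural value to pass as the hint.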

@iravid
Member

iravid commented Jun 27, 2020

Hey @svranesevic! Thank you for following up on this.

A ChunkBuilder would be best here indeed. Appending elements to a chunk is fast, but not as fast as building with a ChunkBuilder.

@svranesevic
Contributor Author

@iravid @simpadjo Thank you for the feedback, on it!

@svranesevic svranesevic force-pushed the ztransducer_collect_all_chunk branch from bda1b39 to bd83fe5 Compare June 27, 2020 09:32
@iravid
Member

iravid commented Jun 27, 2020

Thanks @svranesevic, this revision looks great. The only thing we need to do is make sure the creation of the ChunkBuilder is suspended because it is mutable. You can see how that is done in ZSink.collectAll for an example.
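
Why the allocation of a mutable accumulator needs to be suspended can be shown without ZIO. In this rough sketch a re-runnable program is modeled as a plain thunk, which is an assumption made for illustration and not how ZSink.collectAll is written; the names Program, shared and suspended are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

object SuspendSketch {
  // A program that can be run more than once, modeled as a thunk.
  type Program[A] = () => A

  // The accumulator is allocated when the program is defined,
  // so every run mutates the same buffer.
  def shared(xs: List[Int]): Program[Vector[Int]] = {
    val acc = ArrayBuffer.empty[Int]
    () => { acc ++= xs; acc.toVector }
  }

  // The accumulator is allocated inside the thunk,
  // so every run starts from a fresh buffer.
  def suspended(xs: List[Int]): Program[Vector[Int]] =
    () => { val acc = ArrayBuffer.empty[Int]; acc ++= xs; acc.toVector }
}
```

Running the shared program twice leaks elements from the first run into the second, while the suspended version gives the same result on every run.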

@svranesevic
Contributor Author

Feedback needed: is beca969 good enough, or should I rewrite ZTransducer.{collectAllWhile|collectAllWhileM} along the lines of 161616e?

def collectAllN[I](n: Long): ZTransducer[Any, Nothing, I, Chunk[I]] =
ZTransducer {

def go(in: Chunk[I], builder: ChunkBuilder[I], size: Long): (ChunkBuilder[Chunk[I]], ChunkBuilder[I], Long) =
Contributor

You can append the whole in chunk to builder at once instead of appending its elements one by one. That is faster and would save some lines of code as well.

Contributor Author

@svranesevic svranesevic Jun 29, 2020

Good idea, I was still wrapping my mind around the transducer internals while writing this 😅
With that approach we need to handle the case where appending in would produce a Chunk with more than n elements, by splitting in accordingly and carrying over the leftover. I will check Chunk#splitAt, but afaik it should be better than the current impl.
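
The bulk-append-with-leftover idea described here can be sketched as a pure function. This is not the code from the PR: Vector stands in for Chunk, and push is a hypothetical name. It assumes the carried-over leftover always holds fewer than n elements.

```scala
object BulkCollect {
  // Feed one incoming chunk. Returns the groups of exactly n elements
  // completed by this chunk, plus the leftover to carry into the next call.
  // Precondition: n > 0 and leftover.size < n.
  def push[A](n: Int, leftover: Vector[A], in: Vector[A]): (Vector[Vector[A]], Vector[A]) = {
    require(n > 0, "n must be positive")
    val out     = Vector.newBuilder[Vector[A]]
    var current = leftover
    var rest    = in
    // While enough elements remain to fill a group, split off exactly
    // what is missing and emit the completed group in one bulk step.
    while (current.size + rest.size >= n) {
      val (take, remaining) = rest.splitAt(n - current.size)
      out += (current ++ take)
      current = Vector.empty
      rest = remaining
    }
    (out.result(), current ++ rest)
  }
}
```

Each call threads the returned leftover into the next one, so a group may span several incoming chunks, and a single large chunk may complete several groups.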

}

case Some(in) =>
stateRef.modify {
Contributor

ChunkBuilder could be reused after emitting a chunk, if I'm not wrong

Contributor Author

@svranesevic svranesevic Jun 29, 2020

It could be reused and I almost went for it, but after giving it some thought I'd argue that its reusability is an implementation detail, as ChunkBuilder is not a ReusableBuilder as of right now.

Contributor

ChunkBuilder has a public clear method, so yes

@simpadjo
Contributor

@svranesevic I have a couple of ideas on how to simplify the implementation a bit. Let me explore them today or tomorrow.

@svranesevic
Contributor Author

Sure thing, looking forward to it!

@simpadjo
Contributor

@svranesevic how about finalizing collectAllN and then rewriting other transducers in a separate PR?
They also need some polishing so let's keep the PR scope reasonable. Optimizing collectAllN and introducing sizeHint is already a good addition!

Regarding collectAllN, I suggest

  • not reallocating ChunkBuilder: use just one and clear it
  • update ChunkBuilder in a bulk way

@svranesevic
Contributor Author

I agree with scoping down the PR to collectAllN 👍
Regarding the suggestions:

  • I still disagree with reusing the ChunkBuilder. It is a Builder, and by that abstraction its behavior is undefined after calling Builder#result. We should rely on abstractions, not on implementation details, so either we promote ChunkBuilder to a ReusableBuilder, or @iravid and/or more folks hop in and vote for reusing the ChunkBuilder as it is.
  • "update ChunkBuilder in a bulk way" - 💯
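
The disagreement comes down to what the builder's type promises. The sketch below uses the standard library's Builder and ReusableBuilder traits with VectorBuilder, not ZIO's ChunkBuilder; groupsFresh and groupsReused are hypothetical names chosen for illustration.

```scala
import scala.collection.mutable.{Builder, ReusableBuilder}

object BuilderReuse {
  // Works with any Builder: a fresh builder is allocated per group,
  // so nothing is ever called on a builder after result().
  def groupsFresh[A](xs: List[A], n: Int)(mk: () => Builder[A, Vector[A]]): List[Vector[A]] =
    xs.grouped(n).map { g =>
      val b = mk()
      b ++= g
      b.result()
    }.toList

  // Requires a ReusableBuilder, whose contract allows result()
  // followed by clear() and then building another collection.
  def groupsReused[A](xs: List[A], n: Int)(b: ReusableBuilder[A, Vector[A]]): List[Vector[A]] =
    xs.grouped(n).map { g =>
      b.clear()
      b ++= g
      b.result()
    }.toList
}
```

Asking for ReusableBuilder in the signature moves the reuse guarantee into the type, which is the "rely on abstractions" position; calling clear() on a plain Builder relies on how a particular implementation happens to behave.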

@simpadjo
Contributor

I guess the main reason why ChunkBuilder is not a ReusableBuilder is that not many people have even heard of this trait :)

@iravid
Member

iravid commented Jul 1, 2020

Admittedly, reusing the ChunkBuilder is probably unsafe with streams of Any due to the use of ClassTags. But that's probably not a valid use case.

In any case, let's scope this down to collectAllN like @simpadjo suggested, keep the implementation without reusing the builder and switch to bulk additions to the builder. Then we can merge 💪

@svranesevic svranesevic changed the title from "Replace List with Chunk in ZTransducer.{collectAllN|collectAllWhile|collectAllWhileM}" to "Replace List with Chunk in ZTransducer.collectAllN" Jul 1, 2020
*/
- def collectAllN[I](n: Long): ZTransducer[Any, Nothing, I, List[I]] =
-   foldUntil[I, List[I]](Nil, n)((list, element) => element :: list).map(_.reverse).filter(_.nonEmpty)
+ def collectAllN[I](n: Int): ZTransducer[Any, Nothing, I, Chunk[I]] =
Contributor Author

An implementation detail bubbled up here: n is now an Int instead of a Long, because Chunk's size is an Int. Might that be a deal breaker for this change?

Affects https://github.com/zio/zio/pull/3886/files#diff-e2c16046e05624c39e46b9e4c2870b9cR1596 and https://github.com/zio/zio/pull/3886/files#diff-e2c16046e05624c39e46b9e4c2870b9cR1603.

Member

This is a fair compromise, n is fine as an Int here.

suite("collectAllN")(
testM("happy path") {
- assertM(run(ZTransducer.collectAllN[Int](3), List(Chunk(1, 2, 3, 4))))(equalTo(Chunk(List(1, 2, 3), List(4))))
+ assertM(run(ZTransducer.collectAllN[Int](3), List(Chunk(1, 2, 3, 4))))(
Member

@svranesevic Last request and then this is good to merge: the implementation is a bit more involved now, so can we add a property check? Just equivalence with List#grouped should be sufficient.
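
The requested equivalence with List#grouped can be sketched without a test framework. This is an illustration only, not the property test added in the PR: regroup is a hypothetical stand-in for running the transducer over chunked input, Vector stands in for Chunk, and scala.util.Random with a fixed seed replaces a proper generator.

```scala
import scala.util.Random

object GroupedProperty {
  // Hypothetical stand-in for the transducer: regroup chunked input
  // into groups of n, flushing a final partial group if one remains.
  def regroup[A](n: Int, chunks: List[Vector[A]]): List[Vector[A]] = {
    val out     = List.newBuilder[Vector[A]]
    var current = Vector.empty[A]
    chunks.foreach { in =>
      var rest = in
      while (current.size + rest.size >= n) {
        val (take, remaining) = rest.splitAt(n - current.size)
        out += (current ++ take)
        current = Vector.empty
        rest = remaining
      }
      current = current ++ rest
    }
    if (current.nonEmpty) out += current
    out.result()
  }

  // Property: however the input is chunked, the output equals
  // List#grouped applied to the flattened input.
  def holds(n: Int, chunks: List[Vector[Int]]): Boolean =
    regroup(n, chunks) == chunks.flatten.grouped(n).map(_.toVector).toList

  // Check the property on pseudo-random inputs with a fixed seed.
  def check(runs: Int, seed: Long): Boolean = {
    val rnd = new Random(seed)
    (1 to runs).forall { _ =>
      val n      = 1 + rnd.nextInt(5)
      val chunks = List.fill(rnd.nextInt(6))(Vector.fill(rnd.nextInt(8))(rnd.nextInt(100)))
      holds(n, chunks)
    }
  }
}
```

The property is useful here because the bulk implementation has more edge cases than the fold it replaces: empty chunks, groups spanning chunk boundaries, and chunks larger than n.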

@iravid
Member

iravid commented Jul 15, 2020

Hey @svranesevic, need any help getting this over the finish line?

@svranesevic
Contributor Author

Hey @iravid, haven't had time lately to wrap this up, sorry about it, will do today!

@svranesevic
Contributor Author

Ping @iravid

@iravid
Member

iravid commented Jul 17, 2020

Looks good @svranesevic! Thanks!

@iravid iravid merged commit 598aeb6 into zio:master Jul 17, 2020
@svranesevic svranesevic deleted the ztransducer_collect_all_chunk branch July 17, 2020 13:09