
Conversation

@adamgfraser
Contributor

Following up on the discussion on Discord, this PR improves the efficiency of operations on concatenated chunks. The basic problem is that right now almost all chunk operations are implemented in terms of accessing each index of the chunk. Indexed access is relatively efficient because of balanced concatenation and the use of underlying arrays, but because the data structure is not actually a flat array, indexed access is not nearly as efficient as direct iteration, resulting in terrible performance when iterating over concatenated chunks.
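The cost difference can be sketched with a toy concatenation tree. This is an illustration of the problem only, not zio.Chunk's actual internals; the names `MiniChunk`, `Arr`, and `Concat` are hypothetical.

```scala
// Toy model of a chunk: either a backing array or a concatenation node.
// Illustrative only -- not zio.Chunk's real representation.
sealed trait MiniChunk[A] {
  def length: Int

  // Indexed access: every apply(i) walks the tree from the root,
  // so iterating all n elements by index costs O(n * depth).
  def apply(i: Int): A = this match {
    case Arr(array)   => array(i)
    case Concat(l, r) => if (i < l.length) l(i) else r(i - l.length)
  }

  // Direct iteration: each underlying array is visited exactly once,
  // so iterating all n elements costs O(n).
  def foreach(f: A => Unit): Unit = this match {
    case Arr(array)   => array.foreach(f)
    case Concat(l, r) => l.foreach(f); r.foreach(f)
  }
}
final case class Arr[A](array: Array[A]) extends MiniChunk[A] {
  def length: Int = array.length
}
final case class Concat[A](l: MiniChunk[A], r: MiniChunk[A]) extends MiniChunk[A] {
  def length: Int = l.length + r.length
}

object MiniChunkDemo extends App {
  // A chunk built from ten concatenated arrays of 100 elements each.
  val chunk: MiniChunk[Int] =
    (0 until 10)
      .map(i => Arr(Array.tabulate(100)(j => i * 100 + j)): MiniChunk[Int])
      .reduceLeft(Concat(_, _))

  // Index-by-index traversal re-walks the tree on every access.
  var sumIndexed = 0L
  var i          = 0
  while (i < chunk.length) { sumIndexed += chunk(i); i += 1 }

  // foreach traverses each underlying array directly, once.
  var sumForeach = 0L
  chunk.foreach(sumForeach += _)

  assert(sumIndexed == sumForeach)
  println(sumForeach)
}
```

Both traversals compute the same result; the difference is that the indexed loop pays the tree-walk cost on every element, which is what the benchmarks below measure.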

We can address this in many cases by just using foreach instead of accessing each index. For example, here is a comparison of Chunk with other data types for map and foldLeft on concatenated chunks (created by repeatedly concatenating 1,000 chunks of 1,000 elements each). The first set of results is for the current implementation, the second is for a fully materialized chunk, and the third uses foreach instead of indexed access.

The current implementation is orders of magnitude slower than other collections, whereas just using foreach is as fast as or faster than other collections. Right now I have only done this for map and foldLeft, but I think we need to go through and make sure that no operations on Chunk (as opposed to the Arr subtype) are doing indexed access, and that they all iterate using foreach or iterator instead.

Current Chunk

[info] Benchmark                            Mode  Cnt          Score         Error  Units
[info] ChainBenchmarks.foldLeftLargeChain   avgt    5    7394179.877 ± 2400522.803  ns/op
[info] ChainBenchmarks.foldLeftLargeChunk   avgt    5  152564062.429 ± 3453030.056  ns/op
[info] ChainBenchmarks.foldLeftLargeList    avgt    5    4015942.091 ±  297741.459  ns/op
[info] ChainBenchmarks.foldLeftLargeVector  avgt    5    8167982.583 ±  968383.806  ns/op
[info] ChainBenchmarks.mapLargeChain        avgt    5   12212584.229 ± 1688755.027  ns/op
[info] ChainBenchmarks.mapLargeChunk        avgt    5  154922038.625 ± 3936423.265  ns/op
[info] ChainBenchmarks.mapLargeList         avgt    5   10110147.776 ± 2062606.153  ns/op
[info] ChainBenchmarks.mapLargeVector       avgt    5    6808899.183 ±  861057.070  ns/op

Materialized Chunk

[info] Benchmark                            Mode  Cnt         Score         Error  Units
[info] ChainBenchmarks.foldLeftLargeChain   avgt    5   6954098.879 ±  110454.185  ns/op
[info] ChainBenchmarks.foldLeftLargeChunk   avgt    5   4920782.436 ±  597549.698  ns/op
[info] ChainBenchmarks.foldLeftLargeList    avgt    5   4542490.441 ± 1471651.374  ns/op
[info] ChainBenchmarks.foldLeftLargeVector  avgt    5   8054089.991 ±  244988.236  ns/op
[info] ChainBenchmarks.mapLargeChain        avgt    5  10515048.826 ±  701132.081  ns/op
[info] ChainBenchmarks.mapLargeChunk        avgt    5   4884269.707 ±   54435.301  ns/op
[info] ChainBenchmarks.mapLargeList         avgt    5  11663339.345 ± 2032071.560  ns/op
[info] ChainBenchmarks.mapLargeVector       avgt    5   6777173.856 ±  515598.245  ns/op

Iteration Instead of Indexed Access

[info] Benchmark                            Mode  Cnt         Score         Error  Units
[info] ChainBenchmarks.foldLeftLargeChain   avgt    5   7669070.957 ± 2548209.118  ns/op
[info] ChainBenchmarks.foldLeftLargeChunk   avgt    5   7239030.095 ±  675284.379  ns/op
[info] ChainBenchmarks.foldLeftLargeList    avgt    5   4289477.864 ±  185167.759  ns/op
[info] ChainBenchmarks.foldLeftLargeVector  avgt    5   8532388.157 ±  223626.941  ns/op
[info] ChainBenchmarks.mapLargeChain        avgt    5  12319347.622 ± 3136106.030  ns/op
[info] ChainBenchmarks.mapLargeChunk        avgt    5   5711082.159 ±  565630.752  ns/op
[info] ChainBenchmarks.mapLargeList         avgt    5  11124514.568 ± 1431372.665  ns/op
[info] ChainBenchmarks.mapLargeVector       avgt    5   7067995.506 ±  376664.129  ns/op
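The kind of rewrite being proposed can be sketched in a standalone way (this is a minimal sketch, not the PR's actual code): a foldLeft that threads its accumulator through a single foreach pass never touches the structure by index.

```scala
// Sketch: foldLeft expressed in terms of foreach rather than indexed access.
// The accumulator is threaded through a single pass via a local var, so each
// element is visited exactly once regardless of the shape of the structure.
object FoldViaForeach extends App {
  def foldLeftViaForeach[A, S](foreach: (A => Unit) => Unit)(s: S)(f: (S, A) => S): S = {
    var acc = s
    foreach(a => acc = f(acc, a))
    acc
  }

  // Any collection exposing foreach works; a List stands in for a Chunk here.
  val data = List(1, 2, 3, 4, 5)
  val sum  = foldLeftViaForeach[Int, Int](f => data.foreach(f))(0)(_ + _)
  assert(sum == 15)
  println(sum)
}
```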


@adamgfraser
Contributor Author

Here are the results with the arrayIterator. It is slightly slower on the map benchmark but faster on the foldLeft benchmark, and also more flexible in terms of supporting early termination.

[info] Benchmark                            Mode  Cnt         Score         Error  Units
[info] ChainBenchmarks.foldLeftLargeChain   avgt    5   7255480.568 ± 2393479.461  ns/op
[info] ChainBenchmarks.foldLeftLargeChunk   avgt    5   5214968.659 ±  481081.956  ns/op
[info] ChainBenchmarks.foldLeftLargeList    avgt    5   3989461.345 ±  299550.184  ns/op
[info] ChainBenchmarks.foldLeftLargeVector  avgt    5   8339002.565 ±  185454.856  ns/op
[info] ChainBenchmarks.mapLargeChain        avgt    5  11596649.953 ± 4502893.811  ns/op
[info] ChainBenchmarks.mapLargeChunk        avgt    5   5967337.337 ±  381246.181  ns/op
[info] ChainBenchmarks.mapLargeList         avgt    5  10109670.576 ± 2640786.665  ns/op
[info] ChainBenchmarks.mapLargeVector       avgt    5   6662389.019 ±  623701.259  ns/op
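The early-termination benefit mentioned above can be sketched against an iterator of underlying arrays. This is modeled loosely on the arrayIterator idea; the standalone `exists` here is hypothetical, not the PR's code.

```scala
// Sketch: an exists that walks underlying arrays one at a time and returns
// as soon as the predicate matches, never touching the remaining arrays.
object EarlyTermination extends App {
  def exists[A](arrays: Iterator[Array[A]])(p: A => Boolean): Boolean = {
    while (arrays.hasNext) {
      val array = arrays.next()
      var i     = 0
      while (i < array.length) {
        if (p(array(i))) return true // early termination
        i += 1
      }
    }
    false
  }

  // Count how many arrays are actually pulled; Iterator.map is lazy, so the
  // counter only advances when exists requests the next array.
  var visited = 0
  val arrays = Iterator(Array(1, 2, 3), Array(4, 5, 6), Array(7, 8, 9))
    .map { a => visited += 1; a }

  assert(exists(arrays)(_ == 5)) // match found in the second array
  assert(visited == 2)           // third array was never visited
  println(visited)
}
```

An index-only fold has no natural way to stop partway through without exceptions or sentinel values, which is why an iterator-based traversal is the more flexible primitive.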

@iravid
Member

iravid commented Jun 19, 2020

Really nice. The arrayIterator approach will also require fewer code changes :-)

@adamgfraser adamgfraser requested a review from iravid June 19, 2020 19:39

@iravid iravid left a comment


🔥🔥

@adamgfraser adamgfraser merged commit b5f1830 into zio:master Jun 19, 2020
@adamgfraser adamgfraser deleted the chunk branch July 27, 2020 18:48
