Port Join operator to WorkProcessor model #1256
I think we should start by adding support for additional info to io.prestosql.operator.WorkProcessorOperator.
The same goes for spill. Otherwise we would regress feature-wise.
2f017e8 to 92216f4
Could you add support for operator info, now that #1292 has landed?
Sure, will update it.
2f504b8 to 3709b4d
I don't see a reason why Optional.of(statisticsCounter) would fail here.
269017e to 51363e7
It seems Session should also be part of ProcessorContext.
static import TransformationState
What is the difference between JoinProcessor and JoinTransformation?
I think you still want to update spillInProgress here and validate that the previous spill has not failed.
This method is different from the previous one.
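The check requested above might look roughly like the following. This is only a sketch under stated assumptions: `SpillStateSketch` and `startSpill` are hypothetical names, and the JDK's `CompletableFuture` stands in for whatever future type the operator actually uses.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the operator's spill bookkeeping; not the Presto class.
final class SpillStateSketch
{
    // Starts out as an already-successful future, so the first spill passes validation.
    private CompletableFuture<Void> spillInProgress = CompletableFuture.completedFuture(null);

    // Validates the previous spill, then records the new one.
    void startSpill(CompletableFuture<Void> nextSpill)
    {
        if (!spillInProgress.isDone()) {
            throw new IllegalStateException("previous spill is still in progress");
        }
        if (spillInProgress.isCompletedExceptionally()) {
            throw new IllegalStateException("previous spill failed");
        }
        spillInProgress = nextSpill;
    }
}
```

The point of the sketch is ordering: the old future is inspected before it is overwritten, so a failed spill cannot be silently dropped.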
The code this method originated from is quite complex. What if tryUnspillNext behaved like the old method, that is, called restoreProbe? That would make it easier to be sure that nothing critical changed behind the scenes.
I think you also want tests in TestWorkProcessorPipelineQueries
365c2af to 329006a
sopel39 left a comment
Some comments. Remember to add tests to io.prestosql.tests.TestWorkProcessorPipelineQueries.
session argument is redundant here
extract result of needsInput() if used more than once
But we can't reuse it: the same probe Page can produce multiple output pages in the case of m×n joins, and in that case we might need to reuse that same Page.
If you called addInput(element), then you need to return needsMoreData here.
add an extra boolean: boolean consumedInput and set it to true here
consumedInput should be set to true here. The page has been fully consumed after addInput call
return TransformationState.ofResult(outputPage, consumedInput);
7685c43 to 6b538ad
I think it should be:
public TransformationState<Page> process(@Nullable Page element) {
    // A null element signals that the input is exhausted.
    if (element == null) {
        finish();
    }
    if (isFinished()) {
        return finished();
    }
    ListenableFuture<?> blocked = isBlocked();
    if (!blocked.isDone()) {
        return blocked(blocked);
    }
    // Track whether addInput() fully consumed the current page.
    boolean consumedInput = false;
    if (needsInput()) {
        addInput(element);
        consumedInput = true;
    }
    Page outputPage = getOutput();
    if (outputPage != null) {
        // Ask for the next page only if this one was consumed and the operator is not done.
        return ofResult(outputPage, consumedInput && !isFinished());
    }
    if (consumedInput) {
        return needsMoreData();
    }
    return yield();
}
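To show why the second argument of ofResult matters for m×n joins, here is a self-contained toy model of the protocol. Everything in it is hypothetical (`TransformationSketch`, `State`, `FanOutTransformation`, `drive`); it is not Presto's WorkProcessor API. It only mimics the idea that the driver keeps passing the same element until the transformation reports that it was consumed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified model of the protocol discussed above; not Presto's API.
final class TransformationSketch
{
    enum Type { RESULT, FINISHED }

    static final class State
    {
        final Type type;
        final String result;          // null unless type == RESULT
        final boolean needsMoreData;  // true: the driver may move on to the next input

        State(Type type, String result, boolean needsMoreData)
        {
            this.type = type;
            this.result = result;
            this.needsMoreData = needsMoreData;
        }
    }

    // Emits fanOut results per input, the way one probe page can yield several output pages.
    static final class FanOutTransformation
    {
        private final int fanOut; // assumed to be >= 1
        private int emitted;

        FanOutTransformation(int fanOut)
        {
            this.fanOut = fanOut;
        }

        State process(String element)
        {
            if (element == null) {
                return new State(Type.FINISHED, null, false);
            }
            int index = ++emitted;
            boolean consumedInput = index == fanOut;
            if (consumedInput) {
                emitted = 0;
            }
            return new State(Type.RESULT, element + "#" + index, consumedInput);
        }
    }

    // Driver loop: re-submits the same element until the transformation has consumed it.
    static List<String> drive(List<String> inputs, int fanOut)
    {
        Deque<String> pending = new ArrayDeque<>(inputs);
        FanOutTransformation transformation = new FanOutTransformation(fanOut);
        List<String> output = new ArrayList<>();
        String current = pending.poll(); // null once the input is exhausted
        while (true) {
            State state = transformation.process(current);
            if (state.type == Type.FINISHED) {
                return output;
            }
            output.add(state.result);
            if (state.needsMoreData) {
                current = pending.poll();
            }
        }
    }
}
```

In this model, if the transformation reported needsMoreData as true on its first result, the driver would advance early and the remaining outputs for that element would be lost, which is the failure mode the review comments are guarding against.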
remove whitespace lineitem.partkey = part.partkey
125d025 to 5776365
move below benchmarkBuildHash and make it static
You are missing the test for the other benchmark.
Ping, you are still missing tests for the other benchmark.
sopel39 left a comment
Minor comments. I will run the benchmarks, and if there are no regressions and no objections from other people, I will merge it.
Don't use the ? conditional here; use an explicit if instead.
Setting estimatedProbeBlockBytes to 0 means we might produce pages that are too large once they are materialized.
Please add a check in io.prestosql.operator.LookupJoinPageBuilder#isFull that the number of positions does not exceed PageProcessor#MAX_BATCH_SIZE.
Please add a TODO saying that we should estimate probe block bytes for lazy pages too.
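A minimal sketch of what these three requests could look like together. `SketchJoinPageBuilder`, `appendRow`, and `MAX_PAGE_BYTES` are hypothetical, and the value used for `MAX_BATCH_SIZE` is an assumption; the review only names the constant.

```java
// Hypothetical stand-in for LookupJoinPageBuilder; not the Presto class.
final class SketchJoinPageBuilder
{
    // Assumed value; the review only refers to PageProcessor#MAX_BATCH_SIZE by name.
    static final int MAX_BATCH_SIZE = 8 * 1024;
    // Assumed byte budget for one output page.
    static final long MAX_PAGE_BYTES = 1024 * 1024;

    private int positionCount;
    private long estimatedProbeBlockBytes;
    private long estimatedBuildBytes;

    // probeLoaded == false models a lazy probe block whose size is not known yet.
    void appendRow(boolean probeLoaded, long probeRowBytes, long buildRowBytes)
    {
        positionCount++;
        estimatedBuildBytes += buildRowBytes;
        // Explicit if rather than a ternary, as requested in the review.
        if (probeLoaded) {
            estimatedProbeBlockBytes += probeRowBytes;
        }
        // TODO: estimate probe block bytes for lazy pages too
    }

    // Full when either the byte estimate or the position count reaches its limit.
    boolean isFull()
    {
        long estimatedBytes = estimatedProbeBlockBytes + estimatedBuildBytes;
        return estimatedBytes >= MAX_PAGE_BYTES || positionCount >= MAX_BATCH_SIZE;
    }

    int getPositionCount()
    {
        return positionCount;
    }
}
```

With lazy probe blocks contributing 0 bytes, the byte estimate alone may never trip, so in this sketch the position cap is what bounds the page size.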
e62a1a7 to f89984a
Nope, the lock is still retained by the LookupJoinOperatorFactory
Run the benchmarks twice here to make sure the build side is not released.
290e3a0 to 17863d6
17863d6 to 4e6f894
My cross-check benchmarks.
Combined results:
Another benchmark ( There are some cases when BEFORE or AFTER prevails. I investigated further, and that might be some JIT noise, since I've seen some good and bad runs depending on how the JVM started. Let's move forward with this.
TO-DO:
The benchmark that we took