perf: executor use execute_parent kernel registry in execute_until #8448
joseph-isaacs wants to merge 2 commits
Signed-off-by: Joe Isaacs <[email protected]>
Signed-off-by: Joe Isaacs <[email protected]>
    let max_iterations = max_iterations();

    let session = ctx.session().clone();
    let kernels = session.get::<ArrayKernels>();
I would really like to get rid of this get method. The fact that it's a read that can cause a write can be very surprising
Only once, but ye
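To make the concern in this thread concrete, here is a minimal sketch of a type-keyed session whose `get` inserts a default value on first access. The `Session` and `ArrayKernels` types below are hypothetical stand-ins written for illustration only; they are not the project's actual implementation, and the real `get` may differ in locking and storage details.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the kernel registry stored in the session.
#[derive(Default)]
struct ArrayKernels;

/// Hypothetical stand-in for a session: a type-keyed map of shared state.
#[derive(Default)]
struct Session {
    vars: RwLock<HashMap<TypeId, Arc<dyn Any + Send + Sync>>>,
}

impl Session {
    /// Reads like a lookup, but inserts `T::default()` on the first call.
    fn get<T: Default + Send + Sync + 'static>(&self) -> Arc<T> {
        let key = TypeId::of::<T>();

        // Fast path: shared lock only; the guard is released at the end of this statement.
        let existing = self.vars.read().unwrap().get(&key).cloned();
        if let Some(value) = existing {
            return value.downcast::<T>().expect("value stored under its own TypeId");
        }

        // Slow path: the "read" takes the write lock and mutates the map.
        let mut vars = self.vars.write().unwrap();
        let entry = vars
            .entry(key)
            .or_insert_with(|| Arc::new(T::default()) as Arc<dyn Any + Send + Sync>);
        entry
            .clone()
            .downcast::<T>()
            .expect("value stored under its own TypeId")
    }

    /// Number of values currently stored in the session.
    fn len(&self) -> usize {
        self.vars.read().unwrap().len()
    }
}

fn main() {
    let session = Session::default();
    assert_eq!(session.len(), 0);

    // The first `get` writes to the session ("only once").
    let _kernels = session.get::<ArrayKernels>();
    assert_eq!(session.len(), 1);

    // Later calls take the read-only fast path and leave the map unchanged.
    let _again = session.get::<ArrayKernels>();
    assert_eq!(session.len(), 1);
}
```

In this sketch the write happens once per type, which matches the reply above, but a caller still cannot tell from the signature that a shared-reference `get` may take a write lock.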
Merging this PR will degrade performance by 18.52%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_bool_canonical_into[(1000, 10)] | 20.3 µs | 37.7 µs | -46.09% |
| ❌ | Simulation | dict_canonicalize_zipfian[16, 1000] | 24 µs | 36.6 µs | -34.5% |
| ❌ | Simulation | canonicalize[16, 2] | 24.2 µs | 36.9 µs | -34.38% |
| ❌ | Simulation | filter_powerlaw_by_correlated_runs[1000] | 23.4 µs | 33.3 µs | -29.63% |
| ❌ | Simulation | decompress_utf8[(1000, 4)] | 38.6 µs | 49.6 µs | -22.24% |
| ❌ | Simulation | take_struct_sequential_indices | 50.6 µs | 63.3 µs | -20.04% |
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 176.9 µs | 215.6 µs | -17.96% |
| ❌ | Simulation | search_index_mixed_out_of_range | 256.3 µs | 312 µs | -17.85% |
| ❌ | Simulation | search_index_above_max | 256.2 µs | 311.8 µs | -17.83% |
| ❌ | Simulation | search_index_below_min | 256.2 µs | 311.8 µs | -17.83% |
| ❌ | Simulation | search_index_full_range_random | 256.6 µs | 312.1 µs | -17.8% |
| ❌ | Simulation | search_index_in_range | 256.7 µs | 312.3 µs | -17.79% |
| ❌ | Simulation | decompress_rd[f64, (100000, 0.0)] | 845.4 µs | 1,025.7 µs | -17.58% |
| ❌ | Simulation | decode_bool_nullable[10000_10_alternating_mostly_valid] | 59.1 µs | 71.1 µs | -16.82% |
| ❌ | Simulation | i32_small_overlapping | 40 µs | 47.3 µs | -15.53% |
| ❌ | Simulation | take_fsl_nullable_random[16, 100] | 70.3 µs | 83.2 µs | -15.46% |
| ❌ | Simulation | decompress_rd[f32, (100000, 0.0)] | 499.1 µs | 588.2 µs | -15.15% |
| ❌ | Simulation | encode_varbin[(1000, 2)] | 158.2 µs | 184.1 µs | -14.07% |
| ❌ | Simulation | chunked_varbinview_opt_into_canonical[(100, 100)] | 409.8 µs | 472.6 µs | -13.29% |
| ❌ | Simulation | chunked_bool_canonical_into[(10, 1000)] | 755.3 µs | 865.8 µs | -12.76% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
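The Efficiency figures in the table appear consistent with `BASE / HEAD - 1`, presumably computed on unrounded timings. That formula is inferred from the displayed numbers and is an assumption, not something stated in the report. A small sketch, with the hypothetical helper `efficiency_pct`, for checking a row by hand:

```rust
/// Relative change in percent, ASSUMING the column is BASE / HEAD - 1.
/// This formula is inferred from the reported rows, not taken from CodSpeed documentation.
fn efficiency_pct(base_us: f64, head_us: f64) -> f64 {
    (base_us / head_us - 1.0) * 100.0
}

fn main() {
    // search_index_mixed_out_of_range: 256.3 µs (BASE) vs 312 µs (HEAD)
    println!("{:.2}%", efficiency_pct(256.3, 312.0)); // prints "-17.85%"
}
```

Rows with very small timings can differ from the reported value by a few hundredths of a percent, because the table shows rounded microseconds.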
Comparing ji/perf-executor-execute-parent (bb4f00f) with develop (e1c6ef5)
Footnotes
1. 429 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
Polar Signals Profiling Results
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.045x ➖
datafusion / vortex-file-compressed (1.045x ➖, 0↑ 2↓)
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.081x ➖, 0↑ 3↓)
datafusion / vortex-compact (1.009x ➖, 1↑ 1↓)
datafusion / parquet (1.038x ➖, 0↑ 2↓)
duckdb / vortex-file-compressed (1.100x ➖, 0↑ 4↓)
duckdb / vortex-compact (1.047x ➖, 0↑ 0↓)
duckdb / parquet (1.057x ➖, 0↑ 2↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.989x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.989x ➖, 0↑ 0↓)
datafusion / parquet (0.969x ➖, 1↑ 0↓)
datafusion / arrow (0.984x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.982x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.986x ➖, 0↑ 0↓)
duckdb / parquet (1.015x ➖, 0↑ 4↓)
duckdb / duckdb (0.988x ➖, 0↑ 0↓)
File Size Changes (11 files changed, +0.0% overall, 6↑ 5↓)
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.995x ➖, 3↑ 1↓)
datafusion / vortex-compact (0.995x ➖, 2↑ 1↓)
datafusion / parquet (0.993x ➖, 1↑ 2↓)
duckdb / vortex-file-compressed (1.001x ➖, 2↑ 2↓)
duckdb / vortex-compact (0.996x ➖, 2↑ 1↓)
duckdb / parquet (1.000x ➖, 0↑ 0↓)
duckdb / duckdb (0.996x ➖, 1↑ 0↓)
File Size Changes (7 files changed, -0.0% overall, 2↑ 5↓)
Benchmarks: FineWeb S3
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.041x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.207x ➖, 0↑ 3↓)
datafusion / parquet (1.028x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.893x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.039x ➖, 0↑ 1↓)
duckdb / parquet (0.975x ➖, 0↑ 0↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.021x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.010x ➖, 0↑ 0↓)
duckdb / parquet (1.006x ➖, 0↑ 0↓)
File Size Changes (1 files changed, +0.0% overall, 1↑ 0↓)
Benchmarks: Random Access
Vortex (geomean): 0.949x ➖
unknown / unknown (1.005x ➖, 3↑ 1↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.992x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.997x ➖, 0↑ 0↓)
datafusion / parquet (0.999x ➖, 0↑ 0↓)
datafusion / arrow (0.975x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.999x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.993x ➖, 0↑ 0↓)
duckdb / parquet (1.003x ➖, 0↑ 0↓)
duckdb / duckdb (0.989x ➖, 0↑ 0↓)
File Size Changes (27 files changed, +0.0% overall, 13↑ 14↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.015x ➖, 0↑ 3↓)
datafusion / parquet (1.003x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.005x ➖, 1↑ 1↓)
duckdb / parquet (0.998x ➖, 1↑ 0↓)
duckdb / duckdb (1.005x ➖, 0↑ 0↓)
File Size Changes (104 files changed, +0.0% overall, 54↑ 50↓)
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.941x ➖, 2↑ 1↓)
datafusion / vortex-compact (0.977x ➖, 0↑ 2↓)
datafusion / parquet (1.357x ❌, 1↑ 13↓)
duckdb / vortex-file-compressed (0.910x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.902x ➖, 0↑ 0↓)
duckdb / parquet (0.925x ➖, 0↑ 0↓)
Benchmarks: Compression
Vortex (geomean): 1.005x ➖
unknown / unknown (0.998x ➖, 0↑ 0↓)
Benchmarks: Appian on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.024x ➖, 0↑ 0↓)
datafusion / parquet (0.960x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.011x ➖, 0↑ 0↓)
duckdb / parquet (1.000x ➖, 0↑ 0↓)
duckdb / duckdb (0.977x ➖, 0↑ 0↓)
File Size Changes (4 files changed, -0.0% overall, 1↑ 3↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (1.030x ➖, 0↑ 2↓)
datafusion / vortex-compact (0.936x ➖, 0↑ 0↓)
datafusion / parquet (1.093x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (0.867x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.902x ➖, 0↑ 0↓)
duckdb / parquet (0.932x ➖, 0↑ 1↓)
As part of #8445, we need to use execute_parent kernels in execute_until. This PR makes that change.