[experiment] Layout Reader #8518
Signed-off-by: Nicholas Gates <[email protected]>
Checkpoint of in-progress V2 ScanNode work (segment scheduling driver, scheduled segment source, scan scheduler) so agent fixes can be integrated on a clean base. Reviewed/benchmarked state. Signed-off-by: Nicholas Gates <[email protected]>
The scan2 StructScanNode single-field fast paths (single get_item and single-referenced-field expressions) routed straight to the child scan node, bypassing the parent struct's validity mask. Projecting one field out of a nullable struct therefore returned the child's own values and validity with no parent null mask applied, producing wrong nulls (and a non-nullable result where a nullable one was expected).

Mirror the v1 struct reader's `array.mask(validity)` behaviour: add a small MaskScanNode that reads an input value and the struct's non-nullable boolean validity child and produces `mask(input, validity)`. Wrap the single-field fast-path results in MaskScanNode when the struct is nullable. The full push_struct path already threads validity through StructValueScanNode, so it is unchanged.

Add a V1-vs-V2 differential test harness in vortex-file that scans the same ScanRequest through both paths and asserts equality across flat (nullable + non-nullable), chunked, dict-encoded, zoned, and nested nullable-struct fixtures, plus ports of the v1 struct-null regression tests (test_struct_layout_nulls / test_struct_layout_nested) to the V2 path. Before the fix the five nested-nullable-struct cases failed with "expected i32?, actual i32"; after the fix all 18 cases pass.

Signed-off-by: Nicholas Gates <[email protected]>
Co-Authored-By: Claude Opus 4.8 <[email protected]>
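As a rough illustration of the `mask(input, validity)` step described in this commit, here is a minimal sketch over plain Rust vectors. It does not use the Vortex API: the function name `mask_with_parent_validity` and the `Option<i32>` representation are assumptions made only for this example.

```rust
/// Apply a parent struct's validity to a projected child field.
///
/// `child` holds the field's own values and nulls. `struct_validity[i]` is
/// `true` when row `i` of the parent struct is non-null. Both names are
/// hypothetical stand-ins for the real array types.
fn mask_with_parent_validity(
    child: &[Option<i32>],
    struct_validity: &[bool],
) -> Vec<Option<i32>> {
    assert_eq!(child.len(), struct_validity.len(), "length mismatch");
    child
        .iter()
        .zip(struct_validity)
        // A row is null if either the struct row or the child value is null.
        .map(|(value, &valid)| if valid { *value } else { None })
        .collect()
}
```

With `child = [Some(1), Some(2), None, Some(4)]` and `struct_validity = [true, false, true, true]`, row 1 becomes null because the parent struct is null there, and row 2 stays null because the child itself is null.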
…filter-first

Port of the V1 multi-conjunct filter behavior to the V2 PartitionWorkScheduler driver:

(1) sort filter conjuncts cheapest-first in PreparedScanNodeFile::try_new so expensive residuals (e.g. FSST LIKE) run after cheap selective ones;
(2) when the demanded-row density falls below EXPR_EVAL_THRESHOLD (0.2), read the residual predicate with selection=need so the leaf returns the compacted array and the expression evaluates over only the demanded rows, scattering the verdict back via Mask::intersect_by_rank.

Adds V1-vs-V2 differential cases (low- and high-density multi-conjunct) and a predicate_cost unit test. Improves ClickBench multi-conjunct filters (q22 701->547ms, q23 now < V1). A separate single-LIKE FSST amplification (q21) remains and is tracked separately.

Signed-off-by: Nicholas Gates <[email protected]>
Co-Authored-By: Claude Opus 4.8 <[email protected]>
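The two steps in this commit can be sketched on plain boolean slices. This is only an illustration under stated assumptions: `Conjunct`, `order_conjuncts`, `density`, and `eval_residual` are hypothetical names, and `intersect_by_rank` here is a free function that mimics what the message describes, not the real `Mask::intersect_by_rank` signature.

```rust
/// Density threshold quoted in the commit message (0.2).
const EXPR_EVAL_THRESHOLD: f64 = 0.2;

/// Hypothetical conjunct descriptor: a label plus an estimated cost.
#[derive(Debug, Clone, PartialEq)]
struct Conjunct {
    label: &'static str,
    cost: u32,
}

/// Order conjuncts cheapest-first (stable sort keeps ties in input order).
fn order_conjuncts(mut conjuncts: Vec<Conjunct>) -> Vec<Conjunct> {
    conjuncts.sort_by_key(|c| c.cost);
    conjuncts
}

/// Fraction of rows that are still demanded.
fn density(need: &[bool]) -> f64 {
    if need.is_empty() {
        return 0.0;
    }
    need.iter().filter(|&&n| n).count() as f64 / need.len() as f64
}

/// Scatter a verdict computed over the compacted (demanded) rows back to
/// full length: the k-th set bit of `need` takes the k-th verdict bit.
fn intersect_by_rank(need: &[bool], verdict: &[bool]) -> Vec<bool> {
    let mut ranked = verdict.iter();
    need.iter()
        .map(|&n| n && *ranked.next().expect("verdict shorter than set bits in need"))
        .collect()
}

/// Evaluate a residual predicate, compacting first when demand is sparse.
fn eval_residual(values: &[i64], need: &[bool], pred: impl Fn(i64) -> bool) -> Vec<bool> {
    if density(need) < EXPR_EVAL_THRESHOLD {
        // Sparse: evaluate over only the demanded rows, then scatter back.
        let compacted: Vec<i64> = values
            .iter()
            .zip(need)
            .filter_map(|(&v, &n)| n.then_some(v))
            .collect();
        let verdict: Vec<bool> = compacted.iter().map(|&v| pred(v)).collect();
        intersect_by_rank(need, &verdict)
    } else {
        // Dense: evaluate over every row and intersect with the demand mask.
        values.iter().zip(need).map(|(&v, &n)| n && pred(v)).collect()
    }
}
```

Both branches return the same mask; the sparse branch only avoids running the predicate on rows that earlier, cheaper conjuncts already rejected.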
V2 parallelizes the join probe, aggregate, and Arrow decode ACROSS DataFusion partitions (V1 instead fans one partition into many split tasks). When a query projected a heavily-encoded column (e.g. a single RunEnd chunk for lineitem.l_orderkey), the opener fed split_aligned_row_range coarse chunk boundaries, which collapsed every byte-range file_group onto one partition and serialized the probe ~2-wide (TPC-H q4 ran 2.6x slower than V1).

Feed split_aligned_row_range the scan's own morsel ranges instead: the read-column chunk hints, or the 100k-row fallback when a read column is a single chunk (mirroring PreparedScanNodeFile::splits). Each morsel lands wholly in one partition, so the scan spreads across all of DataFusion's byte-range file_groups with no collapse and no chunk straddling a partition boundary. The assignment is contiguous per partition, so it is correct even when the scan output must preserve order.

Also run the Vortex->Arrow conversion on the runtime CPU pool (handle.spawn_cpu + buffered/buffer_unordered) so decode fans out within a partition rather than running serially on the consumer poll thread.

TPC-H SF1 (datafusion-bench, VORTEX_SCAN_IMPL=v2): q4 goes from 2.6x slower than V1 to faster than V1; overall ~parity.

Signed-off-by: Nicholas Gates <[email protected]>
Co-Authored-By: Claude Opus 4.8 <[email protected]>
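The property this commit relies on (each morsel lands wholly in one partition, and each partition gets a contiguous run of morsels) can be sketched as follows. This is not the real `split_aligned_row_range`: the functions `fallback_morsels` and `assign_morsels` are hypothetical, and assigning a morsel by the row share that contains its first row is an assumption made for this example, whereas the commit works from DataFusion's byte-range file_groups.

```rust
use std::ops::Range;

/// Fallback morsel size quoted in the commit message (100k rows).
const FALLBACK_MORSEL_ROWS: u64 = 100_000;

/// Cut `0..row_count` into fixed-size morsels; the last one may be short.
fn fallback_morsels(row_count: u64) -> Vec<Range<u64>> {
    (0..row_count)
        .step_by(FALLBACK_MORSEL_ROWS as usize)
        .map(|start| start..(start + FALLBACK_MORSEL_ROWS).min(row_count))
        .collect()
}

/// Assign each morsel, whole, to the partition whose nominal share of the
/// row space contains the morsel's first row. Morsels must be sorted by
/// start; the result is then contiguous per partition.
fn assign_morsels(
    morsels: &[Range<u64>],
    row_count: u64,
    partitions: usize,
) -> Vec<Vec<Range<u64>>> {
    let mut out = vec![Vec::new(); partitions];
    if partitions == 0 || row_count == 0 {
        return out;
    }
    for morsel in morsels {
        // u128 arithmetic avoids overflow for very large row counts.
        let p = (morsel.start as u128 * partitions as u128 / row_count as u128) as usize;
        out[p.min(partitions - 1)].push(morsel.clone());
    }
    out
}
```

For a 250,000-row file and two partitions, the three fallback morsels split as two morsels for partition 0 and one for partition 1, with no morsel straddling the boundary.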
…H_FULL_PLAN

With --show-metrics and VORTEX_BENCH_FULL_PLAN=1, print the DataFusion EXPLAIN ANALYZE-style annotated plan (elapsed_compute / output_rows per operator) to stderr, to localize where wall time goes across scan, HashJoin build/probe, and aggregate.

Signed-off-by: Nicholas Gates <[email protected]>
Co-Authored-By: Claude Opus 4.8 <[email protected]>
Rename the runtime scan node API to ScanPlan and move the plan and segment primitives into vortex-scan. Layout v2 now expands directly through layout.new_scan_plan with a plan ScanRequest, and the docs describe the v2 path as the layout scan model. Signed-off-by: Nicholas Gates <[email protected]>
Benchmarks: PolarSignals Profiling (base)
Vortex (geomean): 1.059x ➖
datafusion / vortex-file-compressed (1.059x ➖, 2↑ 4↓)
No file size changes detected.
Benchmarks: FineWeb NVMe (base)
Verdict: Likely regression (high confidence)
datafusion / vortex-file-compressed (1.686x ❌, 1↑ 8↓)
datafusion / parquet (0.943x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (0.909x ➖, 3↑ 3↓)
duckdb / parquet (0.953x ➖, 1↑ 0↓)
File Size Changes (3 files changed, -46.3% overall, 1↑ 2↓)
Benchmarks: TPC-H SF=1 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.001x ➖, 4↑ 6↓)
datafusion / parquet (1.051x ➖, 0↑ 2↓)
datafusion / arrow (1.070x ➖, 1↑ 6↓)
duckdb / vortex-file-compressed (1.036x ➖, 0↑ 1↓)
duckdb / parquet (1.043x ➖, 0↑ 5↓)
File Size Changes (17 files changed, -44.5% overall, 4↑ 13↓)
Benchmarks: TPC-DS SF=1 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.977x ➖, 9↑ 5↓)
datafusion / parquet (0.980x ➖, 2↑ 0↓)
duckdb / vortex-file-compressed (0.966x ➖, 10↑ 3↓)
duckdb / parquet (0.991x ➖, 1↑ 0↓)
File Size Changes (30 files changed, -43.4% overall, 3↑ 27↓)
Benchmarks: Clickbench on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.986x ➖, 7↑ 4↓)
datafusion / parquet (1.005x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.955x ➖, 10↑ 4↓)
duckdb / parquet (0.996x ➖, 0↑ 0↓)
File Size Changes (201 files changed, -39.1% overall, 55↑ 146↓)
Benchmarks: FineWeb S3 (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.869x ➖, 1↑ 0↓)
datafusion / parquet (0.979x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.851x ➖, 1↑ 0↓)
duckdb / parquet (0.996x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=10 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.940x ➖, 1↑ 2↓)
datafusion / parquet (0.995x ➖, 0↑ 0↓)
datafusion / arrow (1.007x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.028x ➖, 1↑ 2↓)
duckdb / parquet (1.009x ➖, 0↑ 1↓)
File Size Changes (47 files changed, -44.5% overall, 7↑ 40↓)
Benchmarks: TPC-H SF=1 on S3 (base)
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (1.193x ➖, 1↑ 9↓)
datafusion / parquet (0.995x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (1.036x ➖, 0↑ 0↓)
duckdb / parquet (0.976x ➖, 0↑ 0↓)
Benchmarks: Statistical and Population Genetics (base)
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.122x ❌, 0↑ 3↓)
duckdb / parquet (1.025x ➖, 0↑ 0↓)
File Size Changes (3 files changed, -32.3% overall, 0↑ 3↓)
Benchmarks: Random Access
Vortex (geomean): 1.069x ➖
unknown / unknown (1.038x ➖, 0↑ 8↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.206x ❌, 1↑ 21↓)
datafusion / vortex-compact (1.275x ❌, 1↑ 21↓)
datafusion / parquet (1.248x ❌, 0↑ 21↓)
datafusion / arrow (1.278x ❌, 0↑ 22↓)
duckdb / vortex-file-compressed (1.248x ❌, 0↑ 21↓)
duckdb / vortex-compact (1.268x ❌, 0↑ 22↓)
duckdb / parquet (1.119x ❌, 0↑ 13↓)
duckdb / duckdb (1.138x ❌, 0↑ 18↓)
File Size Changes (27 files changed, +4.6% overall, 27↑ 0↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.044x ➖, 7↑ 16↓)
datafusion / parquet (1.003x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.886x ✅, 23↑ 0↓)
duckdb / parquet (0.987x ➖, 2↑ 0↓)
duckdb / duckdb (0.963x ➖, 0↑ 0↓)
File Size Changes (105 files changed, +16.6% overall, 102↑ 3↓)
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (1.228x ➖, 1↑ 8↓)
datafusion / vortex-compact (1.224x ➖, 1↑ 12↓)
datafusion / parquet (1.190x ➖, 1↑ 9↓)
duckdb / vortex-file-compressed (1.075x ➖, 1↑ 2↓)
duckdb / vortex-compact (1.119x ➖, 0↑ 2↓)
duckdb / parquet (1.162x ➖, 0↑ 4↓)
Benchmarks: Appian on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.015x ➖, 0↑ 0↓)
datafusion / parquet (1.009x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.061x ➖, 0↑ 2↓)
duckdb / parquet (1.046x ➖, 0↑ 1↓)
duckdb / duckdb (1.027x ➖, 0↑ 0↓)
File Size Changes (4 files changed, +1.7% overall, 2↑ 2↓)
Benchmarks: Compression
Vortex (geomean): 1.123x ❌
unknown / unknown (1.117x ❌, 0↑ 21↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (1.314x ❌, 0↑ 10↓)
datafusion / vortex-compact (1.244x ➖, 0↑ 9↓)
datafusion / parquet (1.174x ➖, 1↑ 8↓)
duckdb / vortex-file-compressed (1.165x ➖, 1↑ 9↓)
duckdb / vortex-compact (1.184x ➖, 1↑ 7↓)
duckdb / parquet (1.121x ➖, 0↑ 3↓)
Remove the filter-through-slice reduction that expands projection masks into child-domain masks, and keep the scan scheduler changes that make task reads explicit and budgeted by read dependencies. Signed-off-by: Nicholas Gates <[email protected]>
Optimize V2 flat slice results before returning them from scan reads so slices over constants collapse back to constants before Arrow export. Also avoid running expensive VarBinView buffer compaction analysis for dense byte-view exports while keeping sparse retained-buffer exports compact. Signed-off-by: Nicholas Gates <[email protected]>
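The first optimization in this commit, collapsing a slice over a constant back into a constant, can be sketched with a toy array enum. The `Array` type and `optimize` function below are hypothetical and far simpler than the real Vortex arrays; the sketch also assumes slice bounds are already valid.

```rust
/// Toy array representation, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Array {
    /// `len` copies of `value`.
    Constant { value: i64, len: usize },
    /// Materialized values.
    Primitive(Vec<i64>),
    /// Lazy view of `child[start..end]`.
    Slice { child: Box<Array>, start: usize, end: usize },
}

/// Collapse slices over constants into shorter constants; leave every
/// other array untouched.
fn optimize(array: Array) -> Array {
    match array {
        Array::Slice { child, start, end } => match optimize(*child) {
            // Any window of a constant is the same constant, only shorter.
            Array::Constant { value, .. } => Array::Constant { value, len: end - start },
            other => Array::Slice { child: Box::new(other), start, end },
        },
        other => other,
    }
}
```

Because the child is optimized first, a slice of a slice of a constant also collapses to a single constant whose length is the outer window's length.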
After what feels like the 27th time exploring this space, I think I might finally be getting somewhere.
This design essentially pulls out a scan engine. Layouts are just one way to take serialized arrays and construct a ScanPlan; in theory we could build a ScanPlan by hand or by any other means.
A ScanPlan node can accept push-down of various operations:
This plan can then be used to answer different types of questions:
[more description to come]