Use trie abstractions for batch implementations #616

frankmcsherry · 2025-06-25T23:10:10Z

Our ord_neu.rs batch/trace implementations use a manual implementation of a short trie. Rather than have the work happen in bespoke methods that we need to copy/paste, this change extracts the logic into trie "layers" that can be composed. For example, the "singleton optimization" for updates lived in four locations, with a fifth in rhh.rs. This change moves it to one location, the update trie layer, which all four now use.
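To make the "singleton optimization" concrete, here is a minimal sketch of how an update layer might own that logic in one place. It assumes the optimization works by not storing a single-update list that equals the most recently stored update, and recording a repeated offset instead. The names `UpdLayer`, `push_list`, `bounds`, `get`, and `stored` are hypothetical and are not the PR's actual API.

```rust
/// Hypothetical update layer (not the PR's `Upds` type).
/// List `i` lives at `upds[offs[i]..offs[i + 1]]`, except that an empty
/// range encodes "a singleton equal to the update just before this offset".
pub struct UpdLayer<T, D> {
    offs: Vec<usize>,
    upds: Vec<(T, D)>,
}

impl<T: Clone + PartialEq, D: Clone + PartialEq> UpdLayer<T, D> {
    pub fn new() -> Self {
        // Leading zero so that list `i` is bounded by `offs[i]` and `offs[i + 1]`.
        Self { offs: vec![0], upds: Vec::new() }
    }

    /// Appends one non-empty list of updates.
    /// Assumption: callers never push an empty list, as an empty range is
    /// reserved for the singleton encoding.
    pub fn push_list(&mut self, list: &[(T, D)]) {
        debug_assert!(!list.is_empty());
        let repeats_last = list.len() == 1 && self.upds.last() == Some(&list[0]);
        if !repeats_last {
            self.upds.extend_from_slice(list);
        }
        // If the list was elided this repeats the previous offset.
        self.offs.push(self.upds.len());
    }

    /// Lower and upper bounds in `self.upds` of list `index`.
    pub fn bounds(&self, index: usize) -> (usize, usize) {
        let mut lower = self.offs[index];
        let upper = self.offs[index + 1];
        if lower == upper {
            // Elided singleton: reuse the update stored just before.
            assert!(lower > 0);
            lower -= 1;
        }
        (lower, upper)
    }

    /// The updates of list `index`.
    pub fn get(&self, index: usize) -> &[(T, D)] {
        let (lower, upper) = self.bounds(index);
        &self.upds[lower..upper]
    }

    /// Number of updates physically stored.
    pub fn stored(&self) -> usize {
        self.upds.len()
    }
}
```

With the encoding and decoding both inside the layer, each batch implementation would only call `push_list` and `get`, and would not need its own copy of the offset adjustment.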

This is the first step in trying to make these types more "trie-forward", revealing their layered structure rather than living behind abstractions that conceal the structure. The goal for the moment is to get a sense for what the code looks like when you compartmentalize and modularize the logic and data. So far, pretty good!

Historically we had something similar, though it was more complicated than it needed to be. The reason seems to be that our trie layers were previously pairs (Vec<T>, Vec<usize>): a list of keys and their offsets into the next layer. It turns out that (Vec<usize>, Vec<T>) is a better representation, with fewer cross-layer dependencies.
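As an illustration of the (Vec<usize>, Vec<T>) shape, the sketch below pairs each layer's items with the offsets that delimit them, so a layer can answer "which range belongs to parent item i" without consulting its neighbours. This is one reading of "fewer cross-layer dependencies", not a description of the merged code. The names `Layer`, `Trie`, `seal`, `from_sorted`, and `lists_for_key` are hypothetical, and the singleton optimization is left out to keep the example short.

```rust
/// Hypothetical layer: `vals[offs[i]..offs[i + 1]]` belongs to parent item `i`.
pub struct Layer<T> {
    offs: Vec<usize>,
    vals: Vec<T>,
}

impl<T> Layer<T> {
    pub fn new() -> Self {
        Self { offs: vec![0], vals: Vec::new() }
    }
    /// Adds an item to the list currently being built.
    pub fn push(&mut self, item: T) {
        self.vals.push(item);
    }
    /// Closes the current list by recording where it ends.
    pub fn seal(&mut self) {
        self.offs.push(self.vals.len());
    }
    /// Lower and upper bounds in `self.vals` of list `index`.
    pub fn bounds(&self, index: usize) -> (usize, usize) {
        (self.offs[index], self.offs[index + 1])
    }
    pub fn get(&self, index: usize) -> &[T] {
        let (lower, upper) = self.bounds(index);
        &self.vals[lower..upper]
    }
}

/// A short trie composed of layers: keys, then values, then updates.
pub struct Trie<K, V, U> {
    keys: Vec<K>,
    vals: Layer<V>,
    upds: Layer<U>,
}

impl<K: Clone + PartialEq, V: Clone + PartialEq, U: Clone> Trie<K, V, U> {
    /// Builds a trie from `(key, val, update)` triples sorted by key, then val.
    pub fn from_sorted(tuples: &[(K, V, U)]) -> Self {
        let mut trie = Trie { keys: Vec::new(), vals: Layer::new(), upds: Layer::new() };
        for (i, (k, v, u)) in tuples.iter().enumerate() {
            let new_key = i == 0 || tuples[i - 1].0 != *k;
            let new_val = new_key || tuples[i - 1].1 != *v;
            // Close the lists that the previous triple was still extending.
            if new_val && i > 0 { trie.upds.seal(); }
            if new_key && i > 0 { trie.vals.seal(); }
            if new_key { trie.keys.push(k.clone()); }
            if new_val { trie.vals.push(v.clone()); }
            trie.upds.push(u.clone());
        }
        if !tuples.is_empty() {
            trie.upds.seal();
            trie.vals.seal();
        }
        trie
    }

    pub fn keys(&self) -> &[K] {
        &self.keys
    }

    /// Each value under the key at `key_index`, with that value's updates.
    pub fn lists_for_key(&self, key_index: usize) -> Vec<(&V, &[U])> {
        let (lower, upper) = self.vals.bounds(key_index);
        (lower..upper).map(|i| (&self.vals.vals[i], self.upds.get(i))).collect()
    }
}
```

In this sketch the value layer is indexed by key position and the update layer by value position; neither layer stores offsets into another layer's contents.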

antiguru

Looks great; let's go!

Copilot

Pull Request Overview

This PR refactors the bespoke trie logic in ord_neu.rs into reusable “layer” abstractions (Vals, Upds, and the UpdsBuilder), and updates both the value‐batch and key‐batch implementations (and the columnar.rs example) to use these new layers instead of manual offset handling and singleton optimizations.

- Introduce layers module with Vals and Upds containers and an UpdsBuilder helper
- Replace manual offset arrays and singleton‐optimization code in OrdValStorage, OrdKeyStorage, and their builders
- Update the columnar.rs example to use the new abstractions

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
differential-dataflow/src/trace/implementations/ord_neu.rs	Extracted trie layers into `layers::{Vals, Upds}` and rewrote `val_batch`/`key_batch` to use them
differential-dataflow/examples/columnar.rs	Updated example builders to leverage `Vals`, `Upds`, and `UpdsBuilder`

Comments suppressed due to low confidence (2)

differential-dataflow/src/trace/implementations/ord_neu.rs:161

The doc comment refers to self.vals, but this method actually uses self.offs to compute offsets into times/diffs. Update the comment to accurately describe what is being bounded (e.g., "offsets into times and diffs").

        /// Lower and upper bounds in `self.vals` of the indexed list.

differential-dataflow/src/trace/implementations/ord_neu.rs:143

[nitpick] The abbreviation Upds may be unclear to new readers. Consider renaming to a more descriptive name (e.g., Updates) to align with Vals and improve readability.

    pub struct Upds<O, T, D> {

differential-dataflow/src/trace/implementations/ord_neu.rs

frankmcsherry added 8 commits June 25, 2025 08:37

Reshape ordneu to resemble trie layers

afbed60

Update columnar example

5c671b0

Extract into module; add bounds function

daf28b2

Add indexing operators to layers

98b5de6

Move singleton update into layer

ce9c8db

Move zero push into containers

bdc34d9

Tidy comments; make members private

84a5405

Update columnar example

4960543

frankmcsherry requested review from Copilot and antiguru July 11, 2025 14:24

frankmcsherry marked this pull request as ready for review July 11, 2025 14:24

antiguru approved these changes Jul 11, 2025

Copilot AI reviewed Jul 11, 2025

differential-dataflow/src/trace/implementations/ord_neu.rs (outdated)

Respond to Copilot feedback

4e3cc24

frankmcsherry merged commit 5caad7b into TimelyDataflow:master Jul 11, 2025
5 checks passed

frankmcsherry deleted the re_tries branch July 11, 2025 14:43

github-actions bot mentioned this pull request Jul 11, 2025

chore: release v0.15.4 #612

Merged
