
Conversation

@jackye1995
Contributor

@jackye1995 jackye1995 commented Dec 29, 2025

Support leveraging an index for v2 merge insert. The index optimization is added in MergeInsertPlanner, which transforms a normal hash join into an optimized join. With this change, the query plan is displayed properly. A few cases:

The on column is fully covered by an index in the target:

MergeInsert: on=[id], when_matched=UpdateAll, when_not_matched=InsertAll, when_not_matched_by_source=Keep...
  CoalescePartitionsExec...
    IndexedLookup: key=id, index=id_idx
      Replay...
        StreamingTableExec: partition_sizes=1, projection=[id, value]

The index covers only some fragments in the target, so a hybrid plan runs that unions the results and then performs a hash join:

MergeInsert: on=[id], when_matched=UpdateAll, when_not_matched=InsertAll, when_not_matched_by_source=Keep...
  CoalescePartitionsExec...
    HashJoinExec...join_type=Left...
      Replay...
        StreamingTableExec: partition_sizes=1, projection=[id, value]
      ...UnionExec...
        IndexedLookup: key=id, index=id_idx...
          Replay...
            StreamingTableExec: partition_sizes=1, projection=[id, value]
        ...LanceScan...range=None
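
The choice between the two plans above, plus the case where no fragment is indexed, can be sketched as a function of fragment coverage. This is a simplified illustration and not the planner's code: `MergePlan`, `plan_for_coverage`, and the plain `u64` fragment ids are hypothetical names introduced here, and fragment ids are assumed to be unique.

```rust
use std::collections::HashSet;

/// Hypothetical summary of which plan shape would be chosen.
#[derive(Debug, PartialEq)]
pub enum MergePlan {
    /// Every target fragment is covered by the index: IndexedLookup only.
    FullyIndexed,
    /// Some fragments are uncovered: union IndexedLookup with a scan of the
    /// uncovered fragments, then hash join against the source.
    Hybrid { unindexed_fragments: Vec<u64> },
    /// No fragment is covered: plain hash join.
    HashJoinOnly,
}

/// Classify the plan from the target's fragment ids and the ids the index covers.
pub fn plan_for_coverage(target_fragments: &[u64], indexed_fragments: &[u64]) -> MergePlan {
    let indexed: HashSet<u64> = indexed_fragments.iter().copied().collect();
    let unindexed: Vec<u64> = target_fragments
        .iter()
        .copied()
        .filter(|id| !indexed.contains(id))
        .collect();

    if unindexed.is_empty() {
        MergePlan::FullyIndexed
    } else if unindexed.len() == target_fragments.len() {
        MergePlan::HashJoinOnly
    } else {
        MergePlan::Hybrid { unindexed_fragments: unindexed }
    }
}
```

For example, a target with fragments 0, 1, 2 and an index covering 0 and 1 would fall into the hybrid case with fragment 2 left to the scan.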

@github-actions github-actions bot added the enhancement (New feature or request) and java labels Dec 29, 2025
@github-actions
Contributor

Code Review: feat: support indexed v2 merge insert

Summary

This PR adds support for indexed merge insert in the v2 path, which uses scalar indices to efficiently look up target rows that match source keys instead of performing full table scans. The implementation handles both fully indexed and partially indexed datasets (where some fragments have index coverage and others don't).

P0/P1 Issues

P1: Unbounded memory usage in ReplayExec
In build_indexed_merge_physical_plan, the code uses Capacity::Unbounded for ReplayExec to avoid deadlocks with HashJoin's CollectLeft mode. While the comment explains the reasoning, this means the entire source dataset could be buffered in memory. For very large source datasets, this could cause OOM issues.

// rust/lance/src/dataset/write/merge_insert.rs
let source_replay = Arc::new(ReplayExec::new(Capacity::Unbounded, source_exec));

Consider adding documentation about this memory characteristic and/or adding a warning when source data exceeds a threshold.
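
One possible shape for such a warning is sketched below. `ReplayBudgetTracker`, `ReplayBudget`, and the byte-based soft limit are hypothetical and are not part of this PR or of ReplayExec; the sketch only shows tracking buffered bytes and reporting once a threshold is crossed.

```rust
/// Hypothetical result of checking buffered source size against a soft limit.
#[derive(Debug, PartialEq)]
pub enum ReplayBudget {
    WithinBudget,
    /// Carries a message the caller could log as a warning.
    Exceeded(String),
}

/// Tracks how many bytes have been buffered so far (assumed helper, not in the PR).
pub struct ReplayBudgetTracker {
    soft_limit_bytes: usize,
    buffered_bytes: usize,
}

impl ReplayBudgetTracker {
    pub fn new(soft_limit_bytes: usize) -> Self {
        Self { soft_limit_bytes, buffered_bytes: 0 }
    }

    /// Record one buffered batch and report whether the soft limit is exceeded.
    pub fn record_batch(&mut self, batch_bytes: usize) -> ReplayBudget {
        // Saturating add avoids overflow on pathological inputs.
        self.buffered_bytes = self.buffered_bytes.saturating_add(batch_bytes);
        if self.buffered_bytes > self.soft_limit_bytes {
            ReplayBudget::Exceeded(format!(
                "replay buffer holds {} bytes, above the soft limit of {} bytes",
                self.buffered_bytes, self.soft_limit_bytes
            ))
        } else {
            ReplayBudget::WithinBudget
        }
    }
}
```

The limit is soft on purpose: the buffer stays unbounded so the deadlock described above is still avoided, and the caller only gains a signal to log.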

P1: UpdateIf condition not properly applied
In create_indexed_action_expr, the UpdateIf case parses the condition but then ignores it:

WhenMatched::UpdateIf(condition_str) => {
    // ... parsing code ...
    if planner.create_physical_expr(&condition).is_ok() {
        // Use the matched condition combined with the parsed condition
        cases.push((matched, assign_action::Action::UpdateAll.as_literal_expr()));
    }
    // ...
}

The parsed condition should be combined with matched (e.g., matched.and(condition)) rather than being discarded. This means UpdateIf behaves the same as UpdateAll in the indexed path.
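
The intended row-level semantics can be sketched with a small standalone model. All names here (`JoinedRow`, `Action`, `action_for`, `value_over_ten`) are hypothetical and simplified; the real code builds DataFusion expressions rather than evaluating rows directly, and the unmatched branch is reduced to `Nothing` for brevity.

```rust
/// Hypothetical stand-in for a joined row: whether the key matched a target
/// row, plus one source value that a condition can inspect.
pub struct JoinedRow {
    pub matched: bool,
    pub source_value: i64,
}

#[derive(Debug, PartialEq)]
pub enum Action {
    UpdateAll,
    Nothing,
}

/// Simplified model of the when_matched setting.
pub enum WhenMatched {
    UpdateAll,
    /// The condition is a plain function pointer in this sketch.
    UpdateIf(fn(&JoinedRow) -> bool),
}

/// Example condition used for illustration only.
pub fn value_over_ten(row: &JoinedRow) -> bool {
    row.source_value > 10
}

/// Decide the action for a row. For UpdateIf, the row must be matched AND
/// satisfy the condition, mirroring `matched.and(condition)`.
pub fn action_for(row: &JoinedRow, when_matched: &WhenMatched) -> Action {
    match when_matched {
        WhenMatched::UpdateAll if row.matched => Action::UpdateAll,
        WhenMatched::UpdateIf(condition) if row.matched && condition(row) => Action::UpdateAll,
        _ => Action::Nothing,
    }
}
```

With this model, a matched row whose value is 3 is updated under `UpdateAll` but left alone under `UpdateIf(value_over_ten)`, which is the distinction the indexed path currently loses.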

P1: Single-column key limitation may be unexpected
The indexed merge insert only supports single-column join keys, but this limitation is only enforced at runtime:

if self.params.on.len() != 1 {
    return Err(Error::invalid_input(
        "Indexed merge insert only supports single-column join keys",
        location!(),
    ));
}

Consider either:

  1. Adding this check earlier in join_key_as_scalar_index so multi-column keys fall back to the non-indexed path silently, OR
  2. Documenting this limitation in the public API
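
Option 1 could look roughly like the sketch below, where the key-count check is part of path selection instead of a late error. `JoinPath` and `choose_join_path` are hypothetical names and do not reflect the actual signature of join_key_as_scalar_index.

```rust
/// Hypothetical result of path selection.
#[derive(Debug, PartialEq)]
pub enum JoinPath {
    IndexedLookup,
    HashJoin,
}

/// Pick the indexed path only for a single join key that has a scalar index;
/// anything else (multi-column keys, unindexed keys) falls back to hash join.
pub fn choose_join_path(on: &[&str], indexed_columns: &[&str]) -> JoinPath {
    match on {
        [key] if indexed_columns.iter().any(|col| col == key) => JoinPath::IndexedLookup,
        _ => JoinPath::HashJoin,
    }
}
```

Under this shape a merge on `[id, ts]` would run through the non-indexed path without an error, even when `id` has an index.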

Minor Observations (Non-blocking)

  • The new batch_size parameter is added to MergeInsertParams but not exposed in MergeInsertBuilder's public API documentation.
  • Good test coverage for full, partial, and no index scenarios.
  • The hybrid execution tests cover the important edge cases well.

Overall

The implementation is solid with good architectural separation between indexed and non-indexed paths. The main concerns are the memory implications of unbounded replay and the UpdateIf semantic issue.

@jackye1995 jackye1995 force-pushed the merge-insert-indexed branch 2 times, most recently from aa2ad6d to 45a32bf Compare December 30, 2025 00:59
@codecov

codecov bot commented Dec 30, 2025

@jackye1995 jackye1995 marked this pull request as draft December 30, 2025 08:26
@jackye1995 jackye1995 force-pushed the merge-insert-indexed branch 2 times, most recently from 97d2dc2 to c537499 Compare December 30, 2025 18:08
/// - Source columns
/// - Target columns (including `_rowid`, `_rowaddr`)
#[derive(Debug)]
pub struct IndexedLookupExec {
Contributor Author

I ended up wrapping this pipeline (source -> project -> MapIndexExec -> AddRowAddrExec -> TakeExec -> project) into a single execution node, since otherwise DF optimizer keeps adding additional repartitioning steps in between. I'm not sure whether there is a better way.

Contributor

otherwise DF optimizer keeps adding additional repartitioning steps in between

I think the DataFusion optimizer does this based on ExecutionPlan::benefits_from_input_partitioning and ExecutionPlan::required_input_distribution. It compares those settings with the child nodes' partitioning to decide whether to add repartitioning. So it is also worth making sure you are declaring the partitioning correctly.

@jackye1995 jackye1995 marked this pull request as ready for review December 30, 2025 22:53
///
/// The source_replay must be a ReplayExec created externally. This ensures
/// the same Arc is used for both the exposed child and internal DAG.
fn build_join_pipeline_with_replay(
Contributor Author

For the fully indexed case, I ended up doing the join within this exec node so that the replay node can be passed around more easily.

@jackye1995 jackye1995 requested a review from wjones127 December 30, 2025 22:57