Conversation

@artjomPlaunov
Contributor

@artjomPlaunov artjomPlaunov commented Nov 24, 2025

Follow up to #19477, fix for https://github.com/duckdblabs/duckdb-internal/issues/6613

The previous PR added support for buffering and replaying WAL index deletes. However, it introduced a memory over-allocation issue: the UnboundIndex stored a vector of BufferedIndexData, and each BufferedIndexData stored its buffered operation in its own ColumnDataCollection. This was extremely wasteful: with interleaved operations (insert -> delete -> ...), every single operation to be replayed ended up in a separate ColumnDataCollection with an internal allocation of STANDARD_VECTOR_SIZE.

EDIT: See @Mytherin's comment below. This PR now fixes the issue by changing how buffering works: we use two buffers, one for inserts and one for deletes. Since inserts and deletes may be interleaved, we also need a vector that stores the replay operations and their intervals within the respective buffer. This is all stored in BufferedIndexReplays within UnboundIndex.

Buffering data is much simpler now: we append directly to either the insert or the delete ColumnDataCollection, and then either append a ReplayRange node or extend the range of the last node if it has the same operation type.
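To make the buffering step concrete, here is a minimal sketch of the append-or-extend logic. It is not the code from this PR: plain `std::vector<int64_t>` buffers stand in for the two ColumnDataCollections, and `BufferedReplays`, `Append`, and the field names are hypothetical. Ranges are assumed to be half-open `[start, end)` offsets into the buffer of the matching type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; not DuckDB's actual API.
enum class ReplayType { INSERT, DELETE };

// Half-open interval [start, end) into the buffer selected by `type`.
struct ReplayRange {
    ReplayType type;
    std::size_t start;
    std::size_t end;
};

struct BufferedReplays {
    std::vector<int64_t> insert_buffer; // stands in for the insert ColumnDataCollection
    std::vector<int64_t> delete_buffer; // stands in for the delete ColumnDataCollection
    std::vector<ReplayRange> ranges;    // replay order across both buffers

    void Append(ReplayType type, const std::vector<int64_t> &rows) {
        if (rows.empty()) {
            return;
        }
        auto &buffer = (type == ReplayType::INSERT) ? insert_buffer : delete_buffer;
        const std::size_t start = buffer.size();
        buffer.insert(buffer.end(), rows.begin(), rows.end());

        if (!ranges.empty() && ranges.back().type == type) {
            // Same operation type as the previous node: extend its range.
            assert(ranges.back().end == start);
            ranges.back().end = buffer.size();
        } else {
            // Operation type changed (or first operation): start a new node.
            ranges.push_back({type, start, buffer.size()});
        }
    }
};
```

With this shape, two consecutive inserts followed by a delete produce two nodes rather than three, and no operation allocates a buffer of its own.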

Replaying is also more efficient: we maintain two interleaved scans over the respective contiguous ColumnDataCollections, fetching one DataChunk at a time to replay.
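The replay loop can be sketched as two forward-only cursors, one per buffer, driven by the ordered list of ranges. This is an illustration under assumptions rather than the PR's implementation: `Replay`, `Chunk`, and the `chunk_size` parameter are hypothetical, plain vectors stand in for the ColumnDataCollections, and each fetched batch is copied for simplicity where a real scan would hand out a DataChunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not DuckDB's actual API.
enum class ReplayType { INSERT, DELETE };

// Half-open interval [start, end) into the buffer selected by `type`.
struct ReplayRange {
    ReplayType type;
    std::size_t start;
    std::size_t end;
};

// One fetched batch, playing the role of a DataChunk.
struct Chunk {
    ReplayType type;
    std::vector<int64_t> rows;
};

void Replay(const std::vector<int64_t> &insert_buffer,
            const std::vector<int64_t> &delete_buffer,
            const std::vector<ReplayRange> &ranges,
            std::size_t chunk_size,
            const std::function<void(const Chunk &)> &apply) {
    assert(chunk_size > 0);
    // Ranges of the same type are consecutive, so each buffer is only
    // ever scanned forward and one cursor per buffer is enough.
    std::size_t insert_pos = 0;
    std::size_t delete_pos = 0;
    for (const auto &range : ranges) {
        const bool is_insert = range.type == ReplayType::INSERT;
        const auto &buffer = is_insert ? insert_buffer : delete_buffer;
        std::size_t &pos = is_insert ? insert_pos : delete_pos;
        assert(pos == range.start && range.end <= buffer.size());
        while (pos < range.end) {
            // Fetch at most one chunk, never reading past the current range.
            const std::size_t count = std::min(chunk_size, range.end - pos);
            std::vector<int64_t> rows(buffer.begin() + pos, buffer.begin() + pos + count);
            apply(Chunk{range.type, std::move(rows)});
            pos += count;
        }
    }
}
```

Because every range starts exactly where the previous range of the same type ended, neither scan ever seeks backwards, which is what keeps the reads memory-adjacent.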

@artjomPlaunov artjomPlaunov force-pushed the unbound-index-allocations branch from 65dca39 to 795dcef Compare November 24, 2025 13:44
@artjomPlaunov artjomPlaunov force-pushed the unbound-index-allocations branch from 795dcef to 9e8ded0 Compare November 24, 2025 14:59
@taniabogatsch taniabogatsch self-requested a review November 26, 2025 10:11
Contributor

@taniabogatsch taniabogatsch left a comment

Hi! Looks great! I just left a bunch of nits and then this is ready to go in from my side. :)

Contributor

@taniabogatsch taniabogatsch left a comment

Just a few more comments / questions.

@artjomPlaunov artjomPlaunov force-pushed the unbound-index-allocations branch from dc66093 to b8cc6e8 Compare November 26, 2025 16:11
@artjomPlaunov artjomPlaunov force-pushed the unbound-index-allocations branch from b8cc6e8 to bfe0aad Compare November 26, 2025 16:16
Contributor

@taniabogatsch taniabogatsch left a comment

No more comments from my side! Let's run CI? :)

@artjomPlaunov
Contributor Author

Yep, thanks for the review!

@artjomPlaunov artjomPlaunov marked this pull request as ready for review November 27, 2025 09:55
@Mytherin
Collaborator

Thanks for the PR!

Perhaps a simpler and more efficient solution here could be to share the ColumnDataCollection between all BufferedIndexData insert / delete nodes. We really only store two different collections:

  • Insert data, holding new data to be inserted
  • Delete data, holding row ids to be deleted

We could have two separate ColumnDataCollection nodes for these, and have each BufferedIndexData refer to a range within the ColumnDataCollection. These ranges will then always be consecutive. For example, if we have the following operations:

INSERT INTO tbl VALUES (2);
DELETE FROM tbl WHERE rowid=1;
COMMIT;

INSERT INTO tbl VALUES (3);
DELETE FROM tbl WHERE rowid=2;
COMMIT;

INSERT INTO tbl VALUES (4);
DELETE FROM tbl WHERE rowid=3;
COMMIT;

We would have the following collections:

InsertCollection

i: [2, 3, 4]

DeleteCollection

rowids: [1, 2, 3]

With the following nodes:

BufferedIndexData
    type: INSERT
    start: 0
    end: 1

BufferedIndexData
    type: DELETE
    start: 0
    end: 1
    
BufferedIndexData
    type: INSERT
    start: 1
    end: 2

BufferedIndexData
    type: DELETE
    start: 1
    end: 2

BufferedIndexData
    type: INSERT
    start: 2
    end: 3

BufferedIndexData
    type: DELETE
    start: 2
    end: 3

This has a number of advantages:

  • We only need to scan two ColumnDataCollections, and we do so in-order, so this will all be memory-adjacent and efficient
  • Constructing these will also be efficient, as we're not constantly allocating tiny batches
  • I think we will likely end up using less memory in most cases, as the (wasted) empty space is capped to the empty space in the two ColumnDataCollections - versus having potentially much more (wasted) empty space spread across different chunks and collections
  • From a code perspective I think this might also be simpler and easier to test fully - given we don't have as many special cases as the proposed solution here adds
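The memory argument above can be checked with some rough arithmetic. The sketch below counts reserved row slots only, and rests on assumptions not stated in this thread: a vector size of 2048 rows (DuckDB's default STANDARD_VECTOR_SIZE), each collection reserving space in whole vectors, and hypothetical helper names throughout.

```cpp
#include <cstddef>

// Assumption: DuckDB's default STANDARD_VECTOR_SIZE of 2048 rows.
constexpr std::size_t kVectorSize = 2048;

// Row slots reserved when `rows` rows are packed into whole vectors.
constexpr std::size_t ReservedSlots(std::size_t rows) {
    return ((rows + kVectorSize - 1) / kVectorSize) * kVectorSize;
}

// One collection per buffered operation: each operation reserves
// at least one full vector, however few rows it holds.
constexpr std::size_t PerOperationSlots(std::size_t operations, std::size_t rows_per_operation) {
    return operations * ReservedSlots(rows_per_operation);
}

// Two shared collections: empty space is capped at one partially
// filled vector per collection.
constexpr std::size_t SharedSlots(std::size_t insert_rows, std::size_t delete_rows) {
    return ReservedSlots(insert_rows) + ReservedSlots(delete_rows);
}
```

Under these assumptions, the six single-row operations in the example reserve 6 × 2048 = 12288 slots with one collection per operation, versus 2 × 2048 = 4096 with two shared collections. For 1000 insert/delete pairs the shared layout still needs only 4096 slots, while the per-operation layout grows to 2000 × 2048 = 4096000.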

@artjomPlaunov artjomPlaunov marked this pull request as draft November 27, 2025 13:13
@artjomPlaunov
Contributor Author

Thanks @Mytherin that's a great idea, going to rewrite it!

@artjomPlaunov artjomPlaunov force-pushed the unbound-index-allocations branch from 5647eb0 to e9376c6 Compare November 27, 2025 19:13
Contributor

@taniabogatsch taniabogatsch left a comment

Thanks for the changes, looking so shiny now haha - left a few comments. :)

@artjomPlaunov
Contributor Author

@taniabogatsch Thank you for the review! I will run the CI now

@artjomPlaunov artjomPlaunov marked this pull request as ready for review December 1, 2025 12:09
@artjomPlaunov artjomPlaunov marked this pull request as draft December 1, 2025 12:14
@artjomPlaunov artjomPlaunov marked this pull request as ready for review December 1, 2025 12:21
@Mytherin
Collaborator

Mytherin commented Dec 1, 2025

Looks great, thanks for the changes!

@pdet pdet merged commit 52fe0d2 into duckdb:v1.4-andium Dec 1, 2025
62 checks passed
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Dec 1, 2025
[Art][Wal]Unbound index allocations (duckdb/duckdb#19901)
Null assertion on denormalized_table argument (duckdb/duckdb#19947)
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Dec 1, 2025
[Art][Wal]Unbound index allocations (duckdb/duckdb#19901)
Null assertion on denormalized_table argument (duckdb/duckdb#19947)

Co-authored-by: krlmlr <[email protected]>
@artjomPlaunov artjomPlaunov deleted the unbound-index-allocations branch December 30, 2025 12:17