Contributor

@Damon07 Damon07 commented May 9, 2025

This PR fixes #17338, where a filter was pruned based on statistics that had already been updated by that same filter. It also replaces filters that return true or NULL with an is_not_null filter.

…eturning true or null with is_not_null filter
@Damon07 Damon07 marked this pull request as draft May 9, 2025 14:30
@Damon07 Damon07 marked this pull request as ready for review May 9, 2025 15:24
@Mytherin Mytherin merged commit a1aecb6 into duckdb:main May 10, 2025
49 checks passed
@Mytherin
Collaborator

Thanks!

@TheoristCoder

Thanks for the fix, appreciate it!

krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
Fixes filter pruning use the statistics updated by the same filter (duckdb/duckdb#17425)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 19, 2025
Fixes filter pruning use the statistics updated by the same filter (duckdb/duckdb#17425)
Mytherin added a commit that referenced this pull request Jun 26, 2025
…18018)

fixes duckdblabs/duckdb-internal#5141

#17425 introduced an optimization
to replace `constant_or_null(true, X)` with an `is_not_null(X)` filter.
The optimization was not well tested and assumes that `X` is a plain
BOUND_COLUMN_REF. If `X` is a function (for example `coalesce(X,
'value')`), the optimization does not produce the same results.

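In the coalesce example, the column X may contain only `NULL`s, but
since it is coalesced with a constant value, the expression itself is
never `NULL` and the original filter keeps every row. An `IS NOT NULL`
filter on the column would instead reject every row, so it produces a
different result.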
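
To make the mismatch concrete, here is a minimal pure-Python sketch of the semantics described above. It is not DuckDB code: `coalesce` and `constant_or_null` are hypothetical stand-ins written for illustration, with `None` playing the role of SQL `NULL`, and the column contents are made up.

```python
def coalesce(*values):
    """Return the first non-None argument, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


def constant_or_null(constant, *children):
    """Return `constant` unless any child is None (SQL NULL)."""
    if any(child is None for child in children):
        return None
    return constant


# Hypothetical column X that contains only NULLs.
column_x = [None, None, None]

# Original filter: constant_or_null(true, coalesce(X, 'value')).
# The coalesce never yields NULL, so every row passes.
original = [
    x for x in column_x
    if constant_or_null(True, coalesce(x, "value")) is True
]

# Incorrect rewrite: IS NOT NULL applied to the column X itself.
rewritten = [x for x in column_x if x is not None]

print(len(original), len(rewritten))  # 3 0
```

The two filters only agree when the argument of `constant_or_null` is the bare column, which is the case the original optimization implicitly assumed.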

I thought about adding an `IsNotNull()` filter on top of the
ExpressionFilter when propagating the statistics
(@propagate_get.cpp:104). This would turn the `constant_or_null`
into `IS NOT NULL (coalesce(X, 'value'))`. The problem there is that
`constant_or_null` can have multiple children, so it would have to turn
into a conjunction (`AND`) over all of the children.
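
As a rough illustration of the multi-child case, the sketch below models `constant_or_null` with several children in plain Python and checks that it matches a conjunction of per-child not-null tests. The helper names and the sample rows are assumptions made for this example, not DuckDB's implementation.

```python
def constant_or_null(constant, *children):
    """Return `constant` unless any child is None (SQL NULL)."""
    if any(child is None for child in children):
        return None
    return constant


def is_not_null_conjunction(*children):
    """Model of (c1 IS NOT NULL AND c2 IS NOT NULL AND ...)."""
    return all(child is not None for child in children)


# Hypothetical rows with two child expressions each.
rows = [(1, "a"), (None, "b"), (2, None), (None, None)]

for a, b in rows:
    via_function = constant_or_null(True, a, b) is True
    via_conjunction = is_not_null_conjunction(a, b)
    # Both forms must agree on whether the row passes the filter.
    assert via_function == via_conjunction

kept = [row for row in rows if is_not_null_conjunction(*row)]
print(kept)  # [(1, 'a')]
```

A correct rewrite would therefore need one not-null check per child, combined with `AND`, rather than a single `IS NOT NULL` filter.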

There is definitely still room to optimize the `constant_or_null`
function, but since this is a bug fix, I tried to keep the diff small.

Development

Successfully merging this pull request may close these issues.

Wrong join result

3 participants