Conversation

@ranma42
Contributor

@ranma42 ranma42 commented Sep 16, 2025

This is an alternative implementation of #36797.

I added a test which I think might make sense regardless, as it checks that the handling of nullable bools does not regress, at least in trivial cases (specifically, that test would fail on EFCore 8 and pass on 9).

The main difference in the approach when compared to #36797 is that instead of a field in the visitor, the allowOptimizedExpansion information is passed while visiting, just like the inSearchConditionContext.

The allowOptimizedExpansion is named like this to match the same value in the SqlNullabilityProcessor as it has the same semantics, namely that "falsy" results (NULLs, FALSEs) can be clumped together.

This was broken on EFCore 8 and works as intended in EFCore 9.
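A minimal T-SQL sketch of why "falsy" results can be clumped together in a search condition. The table variable `@t` and its columns are hypothetical and only serve the illustration; this is not the SQL that EF Core generates.

```sql
-- Hypothetical demo data (names are illustrative, not from the PR).
DECLARE @t TABLE (Id int, A bit NULL, B bit NULL);
INSERT INTO @t VALUES (1, 0, 0), (2, 0, 1), (3, NULL, 1), (4, NULL, NULL);

-- Plain search condition: rows where A <> B is FALSE (Id 1) or
-- UNKNOWN (Id 3, Id 4) are filtered out alike.
SELECT Id FROM @t WHERE A <> B;   -- returns Id 2

-- Lossy form: NULL is folded into 0 before the comparison,
-- yet the same rows pass the filter.
SELECT Id FROM @t
WHERE CASE WHEN A <> B THEN CAST(1 AS bit) ELSE CAST(0 AS bit) END = CAST(1 AS bit);   -- returns Id 2
```

Directly under WHERE, both forms select the same rows, so folding NULL into FALSE loses nothing there.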
Member

@roji roji left a comment

Thanks for submitting this @ranma42... A few comments:

  • This approach here seems close to what I described in the OP of the other PR, which attempts to more narrowly identify cases where the bitwise transformation is problematic, rather than removing it whenever we're in a predicate.
  • I can see the sense of doing that, but I'm wary of making assumptions about what the SQL Server query planner does and doesn't do (after all, this whole fix is because we were assuming it wouldn't see into CASE - which was my own assumption as well, and I think it was reasonable even if incorrect...).
    • For example, IIUC you pass allowOptimizedExpansion: false to the operands of SqlBinaryExpression, meaning that if the CASE is nested inside AND, we'd be doing the bitwise transformation within THEN (hopefully I got that right). I'm not sure I want EF to go too far into trying to model/guess the SQL Server query planner/optimizer.
  • Importantly: what's the downside of doing the broader thing as in my PR? IIUC the original bitwise transformation was done to fix bugs when the value is projected out of the database; but in my PR we only prevent the transformation in predicates, which never get projected out. So what do you think we're potentially losing by having more CASE constructs instead of bitwise? If we don't have a clear idea, I'd personally rather default to CASE (in other words, in the absence of specific data it seems less risky to me to have CASE instead of bitwise rather than the other way around).
  • Regardless of all the above, in this PR I personally find the name allowOptimizedExpansion confusing here; I get the idea of aligning to SqlNullabilityProcessor, but in that visitor there's a single, clear meaning of "optimized expression": contexts where false and null are equivalent. Here, on the other hand, allowOptimizedExpansion currently seems to mean "don't transform to bitwise", which seems odd (at the very least it seems like it should start with "require" or "disallow", rather than "allow")... At least until we get other "optimizations", my preference would be a name that expresses more closely what that variable actually means/controls (e.g. flip the logic and call it allowBitwiseEqualityTransformation; or if you prefer your own approach, maybe at least rename to inPossibleIndexUsageContext).

Let me know what you think of all of the above! We have a couple more days before one of these must be merged for 10.

@ranma42
Contributor Author

ranma42 commented Sep 17, 2025

Thanks for submitting this @ranma42... A few comments:

  • This approach here seems close to what I described in the OP of the other PR, which attempts to more narrowly identify cases where the bitwise transformation is problematic, rather than removing it whenever we're in a predicate.

The idea is basically the same :)

  • I can see the sense of doing that, but I'm wary of making assumptions about what the SQL Server query planner does and doesn't do (after all, this whole fix is because we were assuming it wouldn't see into CASE - which was my own assumption as well, and I think it was reasonable even if incorrect...).

Yes, that was definitely an invalid assumption. I would love to have something like https://www.sqlite.org/optoverview.html for SqlServer, but I am afraid it does not exist; also, I guess SqlServer has changed behavior over time regarding planning/query optimizations and I believe it might also be very complex and dependent on the live data statistics... so probably not something that EFCore can promptly rely on.

  • For example, IIUC you pass allowOptimizedExpansion: false to the operands of SqlBinaryExpression, meaning that if the CASE is nested inside AND, we'd be doing the bitwise transformation within THEN (hopefully I got that right). I'm not sure I want EF to go too far into trying to model/guess the SQL Server query planner/optimizer.

Sorry, allowOptimizedExpansion is indeed a very bad and even misleading name. Currently we have some operations with 2 different translations, one which preserves NULLs and another one which folds them into 0 AS bit (along with false values).
(Unfortunately some cases are not yet handled properly, see AND/OR, for example).
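The two translations mentioned above can be compared side by side in a projection, where the difference becomes visible. This is a sketch under assumptions: the table variable `@t` is hypothetical, and the two expressions are modeled on the queries quoted later in this thread rather than copied from EF Core's output.

```sql
-- Hypothetical demo data (names are illustrative, not from the PR).
DECLARE @t TABLE (Id int, A bit NULL, B bit NULL);
INSERT INTO @t VALUES (1, 0, 1), (2, 1, 1), (3, NULL, 1);

SELECT
    Id,
    -- NULL-preserving translation of A <> B: XOR yields NULL when an operand is NULL.
    A ^ B AS NullPreserving,
    -- Lossy translation: UNKNOWN falls into ELSE and becomes 0.
    CASE WHEN A <> B THEN CAST(1 AS bit) ELSE CAST(0 AS bit) END AS Lossy
FROM @t
ORDER BY Id;
-- Id 1: NullPreserving = 1,    Lossy = 1
-- Id 2: NullPreserving = 0,    Lossy = 0
-- Id 3: NullPreserving = NULL, Lossy = 0
```

Only the row with a NULL operand distinguishes the two forms, which is why the choice matters wherever the result is projected or compared further.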

  • Importantly: what's the downside of doing the broader thing as in my PR? IIUC the original bitwise transformation was done to fix bugs when the value is projected out of the database; but in my PR we only prevent the transformation in predicates, which never get projected out. So what do you think we're potentially losing by having more CASE constructs instead of bitwise? If we don't have a clear idea, I'd personally rather default to CASE (in other words, in the absence of specific data it seems less risky to me to have CASE instead of bitwise rather than the other way around).

The risk is also around predicates, because it is legitimate to compare the result of a Boolean expression.

A test that would emit a different query and a different set of results (for .NET9/this PR vs #36797) is the following one:

    [ConditionalFact]
    public virtual void Where_not_equal_using_relational_null_semantics_complex_in_equals()
    {
        using var context = CreateContext(useRelationalNulls: true);
        context.Entities1
            .Where(e => (e.NullableBoolA != e.NullableBoolB) == e.NullableBoolC)
            .Select(e => e.Id).ToList();
    }

useRelationalNulls feels a bit like cheating, but I guess we should still avoid regressing it (also, the same could be achieved with UDF or other (less self-contained) tricks... but that is probably even less impactful for users).

Comparison of the queries (and results) of the current(/this PR) behavior vs #36797, also available at https://dbfiddle.uk/DJ2muxpD

.NET9 / #36809

    SELECT *
    FROM [Entities1] AS [e]
    WHERE [e].[NullableBoolA] ^ [e].[NullableBoolB] = [e].[NullableBoolC]

| NullableBoolA | NullableBoolB | NullableBoolC |
| :-------------| :-------------| :-------------|
| False         | False         | False         |
| False         | True          | True          |
| True          | False         | True          |
| True          | True          | False         |

#36797

    SELECT *
    FROM [Entities1] AS [e]
    WHERE CASE
        WHEN [e].[NullableBoolA] <> [e].[NullableBoolB] THEN CAST(1 AS bit)
        ELSE CAST(0 AS bit)
    END = [e].[NullableBoolC]

| NullableBoolA | NullableBoolB | NullableBoolC |
| :-------------| :-------------| :-------------|
| null          | null          | False         |
| null          | False         | False         |
| null          | True          | False         |
| False         | null          | False         |
| False         | False         | False         |
| False         | True          | True          |
| True          | null          | False         |
| True          | False         | True          |
| True          | True          | False         |

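To reproduce the comparison above without the dbfiddle link, a setup along these lines should work. It is a sketch under assumptions: the temp table `#Entities1` only mirrors the three nullable columns used in the test, and the seed data (every combination of NULL/0/1) is my own choice, not the test fixture.

```sql
-- Hypothetical stand-in for the Entities1 table used by the test.
CREATE TABLE #Entities1 (
    Id int IDENTITY PRIMARY KEY,
    NullableBoolA bit NULL,
    NullableBoolB bit NULL,
    NullableBoolC bit NULL
);

-- Seed all 27 combinations of NULL / 0 / 1 across the three columns.
WITH v(b) AS (
    SELECT CAST(NULL AS bit) UNION ALL
    SELECT CAST(0 AS bit)    UNION ALL
    SELECT CAST(1 AS bit)
)
INSERT INTO #Entities1 (NullableBoolA, NullableBoolB, NullableBoolC)
SELECT a.b, b.b, c.b
FROM v AS a CROSS JOIN v AS b CROSS JOIN v AS c;

-- NULL-preserving form: only rows with three non-NULL values can match.
SELECT COUNT(*) FROM #Entities1
WHERE NullableBoolA ^ NullableBoolB = NullableBoolC;   -- 4 rows

-- Lossy form: a NULL in A or B is folded into 0, so extra rows match C = 0.
SELECT COUNT(*) FROM #Entities1
WHERE CASE
    WHEN NullableBoolA <> NullableBoolB THEN CAST(1 AS bit)
    ELSE CAST(0 AS bit)
END = NullableBoolC;                                   -- 9 rows

DROP TABLE #Entities1;
```

The row counts match the two result sets listed above (4 rows versus 9 rows).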
  • Regardless of all the above, in this PR I personally find the name allowOptimizedExpansion confusing here; I get the idea of aligning to SqlNullabilityProcessor, but in that visitor there's a single, clear meaning of "optimized expression": contexts where false and null are equivalent. Here, on the other hand, allowOptimizedExpansion currently seems to mean "don't transform to bitwise", which seems odd (at the very least it seems like it should start with "require" or "disallow", rather than "allow")... At least until we get other "optimizations", my preference would be a name that expresses more closely what that variable actually means/controls (e.g. flip the logic and call it allowBitwiseEqualityTransformation; or if you prefer your own approach, maybe at least rename to inPossibleIndexUsageContext).

The name is definitely bad, sorry, I didn't really think much about it yesterday.
Today I took some time to review it, your original name, your suggestion, a few other ideas (and cheated a little bit and asked an LLM for recommendations 😈 ):

  • allowOptimizedExpansion is opaque and possibly misleading; it does not help in making it clear what is going on here
  • inLargerPredicateContext represents the state in a straightforward way, but provides little insight into what is going on (when it should be reset, taken into account, ...)
  • allowBitwiseEqualityTransformation is weird to me: the bitwise transformation is safe even though it can cause inefficiencies; in a sense, it should always be allowed, but we might have to requireBitwiseEqualityTransformation when we know that the result is used in a context where NULLs matter (projections, predicates in which this value is compared to NULLs)
  • preserveNulls might convey the meaning better, although it might still leave the question of what happens to non-preserved NULL values
  • the LLM suggested preserveFalseNullDistinction, allowNullFalseEquivalence, and strictBooleanSemantics (plus a bunch of weird stuff, which I will omit for brevity, as not all of them were really clear 💩 ).

With all of these options, I can definitely see that allowOptimizedExpansion was a very bad choice on my part.
I think preserveFalseNullDistinction or allowNullFalseEquivalence (depending on the "sign" which we prefer) is reasonably clear (maybe a bit long, but for example allowNullFalseEquivalence is just 2 keystrokes more than allowOptimizedExpansion and conveys a much more explicit meaning).

Let me know what you think of all of the above! We have a couple more days before one of these must be merged for 10.

EDIT: added the queries/results for Where_not_equal_using_relational_null_semantics_complex_in_equals

@roji
Member

roji commented Sep 17, 2025

Thanks for all the discussion (as always), and absolutely don't worry about allowOptimizedExpansion naming :)

I have to step out, but just to be sure... Are we aware of any specific issues (either bugs or performance issues) that my broader #36797 would have/create, which this PR would fix? In other words, I'm trying to be sure whether we're aware of any specific advantage between the two approaches (aside from the purely simpler SQL here). At least in my current state of mind, if - as far as we know - the two are equivalent (bug- and perf-wise), I think I'd still prefer my broader approach, simply because it feels like there's more chance of some construct out there which would use the index when in the predicate with my PR and not with this one (in the same way that occurred in #36291).

Otherwise I'll think more about your comments and respond more in detail tomorrow! Thanks again.

When NULLs and FALSEs are treated as equivalent, the result set includes records
with some `NULL` fields; when they are interpreted following the usual relational
semantics, the result set only includes the records:

| NullableBoolA | NullableBoolB | NullableBoolC |
| :-------------| :-------------| :-------------|
| False         | False         | False         |
| False         | True          | True          |
| True          | False         | True          |
| True          | True          | False         |
@ranma42
Contributor Author

ranma42 commented Sep 18, 2025

Thanks for all the discussion (as always), and absolutely don't worry about allowOptimizedExpansion naming :)

As I was force-updating the PR, I took the chance to also replace the name with allowNullFalseEquivalence (I am totally open to replacing it with other options, but I wanted to step away from allowOptimizedExpansion and aim towards a better name).

I have to step out, but just to be sure... Are we aware of any specific issues (either bugs or performance issues) that my broader #36797 would have/create, which this PR would fix?

I updated the Where_not_equal_using_relational_null_semantics_complex_in_equals added in this PR and currently it:

I am unsure if it is considered relevant, as that test relies on useRelationalNulls. OTOH it does not actually depend on that for anything deep, it just makes it easier to get the AST that triggers the issue to the SearchConditionConverter.

In other words, I'm trying to be sure whether we're aware of any specific advantage between the two approaches (aside from the purely simpler SQL here). At least in my current state of mind, if - as far as we know - the two are equivalent (bug- and perf-wise), I think I'd still prefer my broader approach, simply because it feels like there's more chance of some construct out there which would use the index when in the predicate with my PR and not with this one (in the same way that occurred in #36291).

Note that the tests I added only check for correctness (which result set is returned), not for performance.
I believe in most cases we should require correct results even in the face of slower queries, but I can definitely understand that regressing the performance of working queries in order to fix broken-but-unused queries might be a bad tradeoff.

I am not completely sure what the right path is for this; I definitely believe we can get the best of both worlds (efficient and correct queries) and I hope that this PR achieves that (barring known issues that are currently not being tackled).
On a broader perspective, I am afraid the current test tooling is not very effective in detecting when a change in a query leads to a worse execution plan, but I believe it might be possible to assert both on queries and plans... it might make sense for tests which are geared towards ensuring query performance (most tests are still going to be concerned with query correctness).

Otherwise I'll think more about your comments and respond more in detail tomorrow! Thanks again.

Thank you for taking the time to look into this. I am afraid some parts of this are still not 100% clear because of the remaining glitches around nullable Booleans/BITs in SqlServer, but I am very happy to see that this is an area that is being improved upon 🚀

@ranma42 ranma42 marked this pull request as ready for review September 19, 2025 05:29
@ranma42 ranma42 requested a review from a team as a code owner September 19, 2025 05:29
@ranma42
Contributor Author

ranma42 commented Sep 19, 2025

IIUC the deadline is approaching, so I'll try to do a brief recap.

The EFCore provider for SqlServer sometimes translates nullable boolean expressions in a "lossy" way, that folds both NULL and FALSE into 0 AS bit; this is sometimes incorrect (for example in projections) and tracked in #34001.
EFCore 9 fixed some of these cases, most notably (in)equality comparisons involving "int-like" data types (implemented in #34168). The approach implemented in #34168 always relies on bitwise operators (xor aka ^ and not aka ~), which causes serious performance regressions, as seen in #36291.
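For anyone who wants to look at the plans by hand, something like the following could be used to compare a plain comparison with a bitwise form of the same equality. Everything here is an assumption for illustration: the `#Flags` table, its index, and the exact predicate shapes are hypothetical, and the query in #36291 may look different. `GO` is the batch separator understood by sqlcmd/SSMS, and `SET SHOWPLAN_XML` has to be the only statement in its batch.

```sql
-- Hypothetical table and index (not the schema from #36291).
CREATE TABLE #Flags (Id int IDENTITY PRIMARY KEY, Flag bit NULL);
CREATE INDEX IX_Flags_Flag ON #Flags (Flag);
GO

-- Return estimated plans as XML instead of executing the statements.
SET SHOWPLAN_XML ON;
GO

-- Plain comparison on the indexed column.
SELECT Id FROM #Flags WHERE Flag = CAST(1 AS bit);

-- Bitwise form of the same equality: ~(a ^ b) is 1 when a and b are equal.
SELECT Id FROM #Flags WHERE ~(Flag ^ CAST(1 AS bit)) = CAST(1 AS bit);
GO

SET SHOWPLAN_XML OFF;
GO

DROP TABLE #Flags;
GO
```

The two returned plans can then be compared, for example to check whether each one seeks or scans IX_Flags_Flag; the outcome may depend on the server version and on statistics, so it is best verified rather than assumed.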

The two changesets from #36797 and #36809 are attempts to address the performance regression, which mainly differ in when the (in)equality is transformed into a ^.

#36797 uses the lossy translation whenever the comparison is within a predicate (in the current SELECT scope); outside of predicates, the comparison is translated as in EFCore 9.

#36809 uses the lossy translation whenever the comparison is used in an expression that (it is known that it) does not distinguish NULL and FALSE; in all other cases, the comparison is translated as in EFCore 9.

Assuming no further bug is being introduced in either case, I would consider:

This choice involves a tradeoff for which I will defer to @roji and other EFCore maintainers.
I believe both options are better than the status quo, but unfortunately neither is 100% satisfactory.

I still have plans to work more on #34001 and I believe that by providing additional nullability information to this step of the pipeline, a translation that is both correct and as efficient as expected could be implemented... but that will require a few additional intermediate changes, definitely not something for this RC (nor for this major version).

@roji
Member

roji commented Sep 19, 2025

@ranma42 sorry I didn't get around to looking at this more yesterday, and thanks for the recap. I just spent some time thinking about this, and my understanding corresponds to the summary you just posted.

Specifically, thanks for putting together a scenario which fails (correctness) with my broader PR, but passes with yours. I agree there's really no 100% satisfactory answer here, and I have a nagging doubt that there may be additional perf issues which would exist with your PR but not with mine... That's especially painful since performance issues like this are pretty hard for users to spot and narrow down (as in #36291).

However, we're at a point where my PR still has a known correctness issue (the scenario you added in Where_not_equal_using_relational_null_semantics_complex_in_equals), whereas your PR doesn't have a known problem (either correctness or performance). So I'm going to go ahead and merge your PR for 10 rather than mine, and we can always revisit this again in the future.

Thanks again (as always) for your investigation and insights!

On a broader perspective, I am afraid the current test tooling is not very effective in detecting when a change in a query leads to a worse execution plan, but I believe it might be possible to assert both on queries and plans...

Note #23125 which tracks adding a get-the-query-plan feature in EF; once that's done we could also assert on it in tests. As with SQL baselines, that would mainly help us avoid regressing performance, for cases where we originally suspected that there's a possible query performance concern and turned on the plan assertion. In other words, I somewhat doubt that this would have helped us catch this particular perf regression (though it might have).

@roji
Member

roji commented Sep 19, 2025

@artl93 @SamMonoRT I am merging this for RC2 in place of #36797, which has already been approved; #36797 and this PR do very similar changes, fix the same bug and the same servicing template notes apply to both. So to save time I'll go ahead and "transfer" the approval from #36797 to this PR.

@roji roji merged commit a4f37c5 into dotnet:release/10.0 Sep 19, 2025
7 checks passed
@roji roji added the ask-mode label Sep 19, 2025
@roji roji changed the title SQL Server: Don't transform equality to bitwise operations in predicate contexts [rc2] SQL Server: Don't transform equality to bitwise operations in predicate contexts Sep 19, 2025