JIT: Avx512BW Compare Debug/Release difference #114978
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Looking at the disasm for the following
So this essentially creates a mask with 16 bits set. Looks like
But that doesn't work, since this requires all 64 bits. I didn't really check in detail, but it looks like CSE doesn't account for the vector size difference. Look for
It's unclear why this would even CSE in the first place, as they aren't doing equivalent comparisons. Perhaps this is rather an issue with the immediate operand not being checked or similar? It's definitely possible the base type or simd size is also getting missed for some scenario.
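To make the 16-bit versus 64-bit point concrete, here is a minimal sketch of how the width of an all-true mask follows from the vector size. It assumes byte-sized elements and a 128-bit versus 512-bit vector, which the thread implies but does not state outright, and `allTrueMask` is a hypothetical helper, not a JIT API.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the JIT): the all-true mask value for a
// vector of simdSizeBytes bytes whose elements are elemSizeBytes wide.
// A mask register carries one bit per element (lane).
std::uint64_t allTrueMask(unsigned simdSizeBytes, unsigned elemSizeBytes)
{
    unsigned lanes = simdSizeBytes / elemSizeBytes;

    // Shifting a 64-bit value by 64 is undefined, so handle the full-width case separately.
    return (lanes >= 64) ? ~std::uint64_t{0} : ((std::uint64_t{1} << lanes) - 1);
}

// Assumed scenario: a mask built for a 16-byte vector of bytes sets only the
// low 16 bits, so reusing it where a 64-byte vector is compared leaves the
// upper 48 lanes reading as "false".
bool maskCoversAllLanes(std::uint64_t mask, unsigned simdSizeBytes, unsigned elemSizeBytes)
{
    return mask == allTrueMask(simdSizeBytes, elemSizeBytes);
}
```

Under these assumptions, an all-true mask shared between the two comparisons is only correct if it was materialized at the wider of the two sizes.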
In both comparison cases, the comparison is statically known to be true:
value numbering sees the first one:
and produces a
when it gets to:
it gives it the same value:
This leads to a perfectly reasonable CSE of the mask register. However, when generating the mask register value using So:
There is some code in
This should be generally fine and is intentional since we only have The bug seems to be:
Because this isn't actually producing such an It's actually not even clear why or what is generating runtime/src/coreclr/jit/codegenxarch.cpp Line 573 in 6be6c5d
Ok, I see. We end up with the liberal VN being a constant but the conservative VN not being a constant. CSE only looks at the conservative VN when checking for constants because AssertionProp only looks at the conservative VN when doing constant propagation. This means we CSE Likewise for So rather than producing a VN of
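The liberal/conservative mismatch described in this comment can be pictured with a toy model. This is only a sketch: `ToyValueNum`, `ToyValueNumPair`, and `isCseCandidate` are hypothetical stand-ins, not the JIT's real types or functions.

```cpp
#include <cstdint>

// Toy stand-in for a value number: an identity plus a "known constant" flag.
struct ToyValueNum
{
    std::uint32_t id;
    bool          isConstant;
};

// Toy stand-in for the pair of value numbers attached to a tree.
struct ToyValueNumPair
{
    ToyValueNum liberal;
    ToyValueNum conservative;
};

// Mirrors the behaviour described in the comment: the constant check consults
// only the conservative VN, so a tree whose liberal VN is a constant still
// counts as a CSE candidate when its conservative VN is not a constant.
bool isCseCandidate(const ToyValueNumPair& vnp)
{
    return !vnp.conservative.isConstant;
}
```

In this model, a comparison that is statically known to be true under the liberal VN remains eligible for CSE, which matches the situation the comment describes.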
cc @dotnet/jit-contrib @dotnet/intel