Ensure that the various Min and Max APIs are accelerated where possible #87641
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
This mostly does AVX-512 based acceleration, but it also ensures that these are supported on Arm64 where that's trivial to do.
#if defined(FEATURE_HW_INTRINSICS) && defined(TARGET_XARCH)
case NI_System_Math_Max:
case NI_System_Math_Min:
This logic was all extracted to a new helper function, impMinMaxIntrinsic, to avoid code duplication.
Doing so made the diff unhappy, so it's showing a bunch of code as "new" when it actually isn't.
4cce32d to 0e6353c
CC. @dotnet/jit-contrib, @dotnet/avx512-contrib

As per the top post, this ensures various Min/Max APIs are accelerated using available instructions on both Arm64 and x86/x64. For the latter, it utilizes AVX-512 where possible.

Diffs look good, with us saving 16k bytes of codegen on Linux x64 and 29k bytes of codegen on Windows x64 (significantly simpler code that is now branch-free). Cleaning up the intrinsic recognition path also brings a
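For readers wondering why Min/Max needs dedicated JIT handling at all, here is a rough sketch of the semantic gap, written as standalone C++ rather than JIT code. It assumes that `Math.Max(double, double)` propagates NaN and treats +0.0 as greater than -0.0, and it models the plain x86 `maxsd` rule (return the second operand unless the first compares greater). The names `raw_max_x86` and `managed_max` are hypothetical and do not appear in the PR.

```cpp
#include <cmath>

// Hypothetical model of the x86 `maxsd a, b` rule: the second operand is
// returned unless `a > b`. A NaN in `a` is therefore dropped, and
// (+0.0, -0.0) yields -0.0.
double raw_max_x86(double a, double b)
{
    return a > b ? a : b;
}

// Hypothetical reference for the assumed managed semantics:
// NaN propagates, and +0.0 is treated as greater than -0.0.
double managed_max(double a, double b)
{
    if (std::isnan(a))
    {
        return a;
    }
    if (std::isnan(b))
    {
        return b;
    }
    if (a == b)
    {
        // Only reachable with differing sign bits for +0.0 vs -0.0;
        // prefer the operand whose sign bit is clear.
        return std::signbit(a) ? b : a;
    }
    return a > b ? a : b;
}
```

Calling `raw_max_x86(NAN, 1.0)` gives `1.0`, while `managed_max(NAN, 1.0)` gives NaN. The branchy fix-up in `managed_max` is the kind of code that a single instruction sequence can replace when the hardware offers a max operation with matching semantics.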