tannergooding
Member

This mostly adds AVX-512-based acceleration for the Min/Max APIs, but it also ensures they are supported on Arm64 where that's trivial to do.

@ghost ghost assigned tannergooding Jun 15, 2023
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 15, 2023

ghost commented Jun 15, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: tannergooding
Assignees: tannergooding
Labels: area-CodeGen-coreclr
Milestone: -


#if defined(FEATURE_HW_INTRINSICS) && defined(TARGET_XARCH)
case NI_System_Math_Max:
case NI_System_Math_Min:
Member Author

This logic was all extracted into a new helper function, impMinMaxIntrinsic, to avoid code duplication.

Member Author

Doing so confused the diff, so it's showing a bunch of code as "new" when it actually isn't.
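
For readers unfamiliar with this kind of refactoring, the shape of the change can be sketched roughly as below. This is a minimal illustration, not the JIT's code: only the names NI_System_Math_Max and NI_System_Math_Min come from the quoted diff, while the enum, `minMaxHelper`, `importIntrinsic`, and their signatures are hypothetical stand-ins (the real impMinMaxIntrinsic signature is not shown in this conversation).

```cpp
#include <string>

// Hypothetical stand-in for the JIT's named-intrinsic IDs. Only the two
// Math entries are taken from the diff; NI_Other is invented for the sketch.
enum NamedIntrinsic
{
    NI_System_Math_Max,
    NI_System_Math_Min,
    NI_Other,
};

// Hypothetical shared helper: the Min and Max paths differ only by a flag,
// so one function can serve both instead of two near-identical case bodies.
std::string minMaxHelper(bool isMax)
{
    return isMax ? "max" : "min";
}

// Each case label forwards to the shared helper rather than inlining
// its own copy of the logic.
std::string importIntrinsic(NamedIntrinsic ni)
{
    switch (ni)
    {
        case NI_System_Math_Max:
            return minMaxHelper(/* isMax */ true);

        case NI_System_Math_Min:
            return minMaxHelper(/* isMax */ false);

        default:
            return "unhandled";
    }
}
```

Moving a large case body into a separate function like this is also the kind of change that line-based diff tools tend to render as a deletion plus an unrelated-looking addition.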

@tannergooding tannergooding marked this pull request as ready for review June 16, 2023 17:24
@tannergooding tannergooding added the avx512 Related to the AVX-512 architecture label Jun 16, 2023
@tannergooding
Member Author

CC. @dotnet/jit-contrib, @dotnet/avx512-contrib

As per the top post, this ensures various Min/Max APIs are accelerated using available instructions on both Arm64 and x86/x64. For the latter, it utilizes AVX-512 where possible.

Diffs look good: we save 16k bytes of codegen on Linux x64 and 29k bytes of codegen on Windows x64 (the code is significantly simpler and is now branch-free).
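
To see why the non-accelerated code needs branches at all, it may help to look at a scalar reference for the floating-point semantics involved. The sketch below assumes the behavior .NET documents for Math.Max on doubles (NaN propagates, and +0.0 is treated as greater than -0.0); `maxReference` and `naiveMax` are hypothetical names for illustration, not functions from the JIT or from this PR.

```cpp
#include <cmath>
#include <limits>

// A plain comparison. It gets the special cases wrong:
// a NaN in x is dropped, and the sign of zero depends on argument order.
double naiveMax(double x, double y)
{
    return (x > y) ? x : y;
}

// Branchy scalar reference for the assumed semantics:
// NaN propagates and +0.0 is preferred over -0.0.
double maxReference(double x, double y)
{
    if (std::isnan(x))
    {
        return x;
    }
    if (std::isnan(y))
    {
        return y;
    }
    if (x == y)
    {
        // Equal values include the +0.0 / -0.0 pair, which compare equal;
        // pick the operand whose sign bit is clear.
        return std::signbit(x) ? y : x;
    }
    return (x > y) ? x : y;
}
```

Each of those checks is a potential branch in straightforward scalar codegen, which is the kind of code the accelerated paths are meant to replace.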

Cleaning up the intrinsic recognition path also brings a -0.12% throughput (TP) improvement for ASP.NET and some smaller improvements for other scenarios.
