Is this to help downstream? It seems to me that the application for NumPy itself would be very narrow (int8 matrix multiplication, which is completely unoptimized right now).
Proposed new feature or change:
Add CPU feature detection for Intel's Advanced Matrix Extensions (AMX). On wide CPU cores the Advanced Matrix Extensions have the potential to increase performance manyfold compared to AVX for certain operations.
Feature sets and Detection
Intel AMX consists of 3 feature sets. AMX-TILE is the base instruction set. For both INT8 and BF16 there are Tile Matrix Multiply (TMUL) units, each with an additional instruction set: AMX-INT8 and AMX-BF16.
See https://en.wikichip.org/wiki/x86/amx#Instructions
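To make the detection part concrete, here is a minimal sketch of how the three feature sets could be told apart. It is not NumPy's implementation, and the function names are hypothetical. It assumes the bit positions Intel documents for CPUID leaf 7, subleaf 0, register EDX (bit 22 for AMX-BF16, bit 24 for AMX-TILE, bit 25 for AMX-INT8) and the `amx_*` flag names that Linux prints in /proc/cpuinfo. It only decodes values handed to it; executing CPUID itself, and checking that the OS has enabled the tile state, would still have to happen in C.

```python
# Assumed bit positions in CPUID.(EAX=7, ECX=0):EDX.
AMX_CPUID_BITS = {"AMX-BF16": 22, "AMX-TILE": 24, "AMX-INT8": 25}

# Assumed flag names as printed by Linux in /proc/cpuinfo.
AMX_CPUINFO_FLAGS = {
    "AMX-BF16": "amx_bf16",
    "AMX-TILE": "amx_tile",
    "AMX-INT8": "amx_int8",
}


def decode_amx_features(edx):
    """Map a raw EDX value from CPUID leaf 7, subleaf 0 to AMX feature flags."""
    return {name: bool((edx >> bit) & 1) for name, bit in AMX_CPUID_BITS.items()}


def usable_amx_features(edx):
    """Treat AMX-INT8 and AMX-BF16 as usable only when the AMX-TILE base set is present."""
    features = decode_amx_features(edx)
    if not features["AMX-TILE"]:
        return {name: False for name in features}
    return features


def amx_features_from_cpuinfo(text):
    """Look for the AMX flags in the first 'flags' line of /proc/cpuinfo text."""
    for line in text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return {name: flag in flags for name, flag in AMX_CPUINFO_FLAGS.items()}
    # No flags line found (for example on a non-Linux system): report nothing.
    return {name: False for name in AMX_CPUINFO_FLAGS}
```

On Linux, `amx_features_from_cpuinfo(open("/proc/cpuinfo").read())` would report what the kernel exposes. Gating the two TMUL sets on AMX-TILE mirrors the description above, where AMX-TILE is the base instruction set.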
Development history
Intel AMX was merged into Linux in October 2021 and included in the 5.16 kernel. Support was backported to Ubuntu 22.04.1 LTS in August 2022.
Similar efforts
Resources
This enhancement might be similar to #20821, #20552 and #22265.