[API Proposal]: Add AVX-VNNI-INT8 and AVX-VNNI-INT16 API #112586
Comments
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics |
@anthonycanino @tannergooding @saucecontrol for review |
@khushal1996 these instructions accumulate into an existing sum, so that accumulator needs to be an argument to the method:
See: #110032 (comment) and the accepted shape for AVX-VNNI Also, we don't need the |
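To make the accumulator point concrete, here is a minimal scalar sketch of what one 32-bit lane of the signed-by-signed byte form is understood to compute: four byte products are summed and added to the incoming value. The class and method names (`VnniScalarModel`, `DotAccumulateLane`) are hypothetical and only illustrate the semantics; they are not part of the proposal or of any shipped API.

```csharp
using System;

// Hypothetical scalar model, for illustration only (not a proposed API).
public static class VnniScalarModel
{
    // Models one 32-bit lane: addend + sum of four sbyte * sbyte products.
    // The non-saturating form is assumed to wrap on overflow, hence unchecked.
    public static int DotAccumulateLane(int addend, ReadOnlySpan<sbyte> left, ReadOnlySpan<sbyte> right)
    {
        int sum = addend;
        for (int i = 0; i < 4; i++)
        {
            // sbyte operands are promoted to int before the multiply.
            sum = unchecked(sum + left[i] * right[i]);
        }
        return sum;
    }
}
```

Because the running total is passed in and returned, a loop over a longer input can chain calls by feeding each result back in as the next `addend`, which is why the vector methods need the accumulator as a parameter.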
Thanks @saucecontrol. I have updated the API doc accordingly. |
@saucecontrol @tannergooding let me know if you have any more reviews/concerns and we can take this to approval |
Sending out a reminder for this issue @tannergooding @saucecontrol |
Looks right to me now 👍 |
Closing this issue and considering this approved. We will start the implementation soon. |
This still needs to go to API review for formal approval, but I don't expect any changes from the surface area here. |
I would like to propose that we change the following APIs for
to...
And the following APIs from
to...
Mainly because of how the intrinsic is internally implemented. Since for saturating APIs in cases of I was working on the implementation and came to this conclusion due to some failing template tests. We can still handle the failing template tests by manipulating the result, but the API would not work in the way it is shown in the document. Let me know if we want to do this. For the non-saturating case, we can still trim down the value and return the int value, but for the saturating case, we will override the expected behavior if we define them as |
We have:
We then have saturating and non-saturating versions of each. So I would expect we have:
Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<short> left, Vector128<ushort> right);
Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<ushort> left, Vector128<short> right);
Vector128<uint> MultiplyWideningAndAdd(Vector128<uint> addend, Vector128<ushort> left, Vector128<ushort> right);
Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<short> left, Vector128<ushort> right);
Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<ushort> left, Vector128<short> right);
Vector128<uint> MultiplyWideningAndAddSaturate(Vector128<uint> addend, Vector128<ushort> left, Vector128<ushort> right); |
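One way to see why the result element type matters for the saturating forms is a scalar sketch of a single 32-bit lane of the word variants (two 16-bit products per lane). The helper names below are hypothetical, and the model assumes the saturating forms clamp the exact sum to the range of the result type while the non-saturating forms wrap.

```csharp
using System;

// Hypothetical scalar model of one 32-bit lane, for illustration only.
public static class VnniInt16ScalarModel
{
    // signed x unsigned words, non-saturating: the exact sum wraps to 32 bits.
    public static int WrapSignedLane(int addend, short l0, short l1, ushort r0, ushort r1)
    {
        long exact = (long)addend + (long)l0 * r0 + (long)l1 * r1;
        return unchecked((int)exact);
    }

    // signed x unsigned words, saturating: the exact sum is clamped to the int range.
    public static int SaturateSignedLane(int addend, short l0, short l1, ushort r0, ushort r1)
    {
        long exact = (long)addend + (long)l0 * r0 + (long)l1 * r1;
        return (int)Math.Clamp(exact, int.MinValue, int.MaxValue);
    }

    // unsigned x unsigned words, saturating: the exact sum is clamped to the uint range.
    public static uint SaturateUnsignedLane(uint addend, ushort l0, ushort l1, ushort r0, ushort r1)
    {
        ulong exact = (ulong)addend + (ulong)l0 * r0 + (ulong)l1 * r1;
        return (uint)Math.Min(exact, uint.MaxValue);
    }
}
```

In this model an unsigned lane that overflows saturates to `uint.MaxValue`; reinterpreted as `int`, that same bit pattern would read as -1 rather than `int.MaxValue`, so the unsigned-by-unsigned overloads are easier to reason about when they take and return `uint`.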
Thanks @tannergooding. To clarify, I would make similar changes in |
Yep, sounds good and like the right fixes to make. |
Updated API doc
AVXVNNIINT8
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using System.Diagnostics.CodeAnalysis;
using System.Runtime.CompilerServices;
namespace System.Runtime.Intrinsics.X86
{
/// <summary>Provides access to the x86 AVX-VNNI-INT8 hardware instructions via intrinsics.</summary>
[Intrinsic]
[CLSCompliant(false)]
public abstract class AvxVnniInt8 : Avx2
{
internal AvxVnniInt8() { }
/// <summary>Gets a value that indicates whether the APIs in this class are supported.</summary>
/// <value><see langword="true" /> if the APIs are supported; otherwise, <see langword="false" />.</value>
/// <remarks>A value of <see langword="false" /> indicates that the APIs will throw <see cref="PlatformNotSupportedException" />.</remarks>
public static new bool IsSupported { get => IsSupported; }
/// <summary>Provides access to the x86 AVX-VNNI-INT8 hardware instructions, that are only available to 64-bit processes, via intrinsics.</summary>
[Intrinsic]
public new abstract class X64 : Avx2.X64
{
internal X64() { }
/// <summary>Gets a value that indicates whether the APIs in this class are supported.</summary>
/// <value><see langword="true" /> if the APIs are supported; otherwise, <see langword="false" />.</value>
/// <remarks>A value of <see langword="false" /> indicates that the APIs will throw <see cref="PlatformNotSupportedException" />.</remarks>
public static new bool IsSupported { get => IsSupported; }
}
// VPDPBSSD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<sbyte> left, Vector128<sbyte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSUD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<sbyte> left, Vector128<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBUUD xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAdd(Vector128<uint> addend, Vector128<byte> left, Vector128<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSSD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<sbyte> left, Vector256<sbyte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSUD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<sbyte> left, Vector256<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBUUD ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAdd(Vector256<uint> addend, Vector256<byte> left, Vector256<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSSDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<sbyte> left, Vector128<sbyte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSUDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<sbyte> left, Vector128<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBUUDS xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAddSaturate(Vector128<uint> addend, Vector128<byte> left, Vector128<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSSDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<sbyte> left, Vector256<sbyte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSUDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<sbyte> left, Vector256<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBUUDS ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAddSaturate(Vector256<uint> addend, Vector256<byte> left, Vector256<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
/// <summary>Provides access to the x86 AVX10.2/512 hardware instructions for AVX-VNNI-INT8 via intrinsics.</summary>
[Intrinsic]
public abstract class V512
{
internal V512() { }
/// <summary>Gets a value that indicates whether the APIs in this class are supported.</summary>
/// <value><see langword="true" /> if the APIs are supported; otherwise, <see langword="false" />.</value>
/// <remarks>A value of <see langword="false" /> indicates that the APIs will throw <see cref="PlatformNotSupportedException" />.</remarks>
public static bool IsSupported { get => IsSupported; }
// VPDPBSSD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<sbyte> left, Vector512<sbyte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<sbyte> left, Vector512<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBUUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAdd(Vector512<uint> addend, Vector512<byte> left, Vector512<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSSDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<sbyte> left, Vector512<sbyte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<sbyte> left, Vector512<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBUUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAddSaturate(Vector512<uint> addend, Vector512<byte> left, Vector512<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
}
}
}
AVXVNNIINT16
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using System.Diagnostics.CodeAnalysis;
using System.Runtime.CompilerServices;
namespace System.Runtime.Intrinsics.X86
{
/// <summary>Provides access to the x86 AVX-VNNI-INT16 hardware instructions via intrinsics.</summary>
[Intrinsic]
[CLSCompliant(false)]
public abstract class AvxVnniInt16 : Avx2
{
internal AvxVnniInt16() { }
/// <summary>Gets a value that indicates whether the APIs in this class are supported.</summary>
/// <value><see langword="true" /> if the APIs are supported; otherwise, <see langword="false" />.</value>
/// <remarks>A value of <see langword="false" /> indicates that the APIs will throw <see cref="PlatformNotSupportedException" />.</remarks>
public static new bool IsSupported { get => IsSupported; }
/// <summary>Provides access to the x86 AVX-VNNI-INT16 hardware instructions, that are only available to 64-bit processes, via intrinsics.</summary>
[Intrinsic]
public new abstract class X64 : Avx2.X64
{
internal X64() { }
/// <summary>Gets a value that indicates whether the APIs in this class are supported.</summary>
/// <value><see langword="true" /> if the APIs are supported; otherwise, <see langword="false" />.</value>
/// <remarks>A value of <see langword="false" /> indicates that the APIs will throw <see cref="PlatformNotSupportedException" />.</remarks>
public static new bool IsSupported { get => IsSupported; }
}
// VPDPWSUD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<short> left, Vector128<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUSD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<ushort> left, Vector128<short> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUUD xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAdd(Vector128<uint> addend, Vector128<ushort> left, Vector128<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWSUD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<short> left, Vector256<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUSD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<ushort> left, Vector256<short> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUUD ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAdd(Vector256<uint> addend, Vector256<ushort> left, Vector256<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWSUDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<short> left, Vector128<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUSDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<ushort> left, Vector128<short> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUUDS xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAddSaturate(Vector128<uint> addend, Vector128<ushort> left, Vector128<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWSUDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<short> left, Vector256<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUSDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<ushort> left, Vector256<short> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUUDS ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAddSaturate(Vector256<uint> addend, Vector256<ushort> left, Vector256<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
/// <summary>Provides access to the x86 AVX10.2/512 hardware instructions for AVX-VNNI-INT16 via intrinsics.</summary>
[Intrinsic]
public abstract class V512
{
internal V512() { }
/// <summary>Gets a value that indicates whether the APIs in this class are supported.</summary>
/// <value><see langword="true" /> if the APIs are supported; otherwise, <see langword="false" />.</value>
/// <remarks>A value of <see langword="false" /> indicates that the APIs will throw <see cref="PlatformNotSupportedException" />.</remarks>
public static bool IsSupported { get => IsSupported; }
// VPDPWSUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<short> left, Vector512<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUSD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<ushort> left, Vector512<short> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAdd(Vector512<uint> addend, Vector512<ushort> left, Vector512<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWSUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<short> left, Vector512<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUSDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<ushort> left, Vector512<short> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAddSaturate(Vector512<uint> addend, Vector512<ushort> left, Vector512<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
}
}
} |
Can you update the top post, as that's what API review will be looking at when it bubbles up to be looked at? |
Updated. |
@tannergooding do we have a timeframe for a review on this and #113090? Once they are approved, I think we can open a PR implementing both in full. |
It's likely a couple weeks out given the placement of it in the API review list: https://apireview.net/ |
namespace System.Runtime.Intrinsics.X86
{
[Intrinsic]
[CLSCompliant(false)]
public abstract class AvxVnniInt8 : Avx2
{
internal AvxVnniInt8() { }
public static new bool IsSupported { get => IsSupported; }
[Intrinsic]
public new abstract class X64 : Avx2.X64
{
internal X64() { }
public static new bool IsSupported { get => IsSupported; }
}
// VPDPBSSD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<sbyte> left, Vector128<sbyte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSUD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<sbyte> left, Vector128<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBUUD xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAdd(Vector128<uint> addend, Vector128<byte> left, Vector128<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSSD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<sbyte> left, Vector256<sbyte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSUD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<sbyte> left, Vector256<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBUUD ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAdd(Vector256<uint> addend, Vector256<byte> left, Vector256<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSSDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<sbyte> left, Vector128<sbyte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSUDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<sbyte> left, Vector128<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBUUDS xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAddSaturate(Vector128<uint> addend, Vector128<byte> left, Vector128<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSSDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<sbyte> left, Vector256<sbyte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSUDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<sbyte> left, Vector256<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBUUDS ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAddSaturate(Vector256<uint> addend, Vector256<byte> left, Vector256<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
[Intrinsic]
public abstract class V512
{
internal V512() { }
public static bool IsSupported { get => IsSupported; }
// VPDPBSSD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<sbyte> left, Vector512<sbyte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<sbyte> left, Vector512<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBUUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAdd(Vector512<uint> addend, Vector512<byte> left, Vector512<byte> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPBSSDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<sbyte> left, Vector512<sbyte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBSUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<sbyte> left, Vector512<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPBUUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAddSaturate(Vector512<uint> addend, Vector512<byte> left, Vector512<byte> right) => MultiplyWideningAndAddSaturate(addend, left, right);
}
}
[Intrinsic]
[CLSCompliant(false)]
public abstract class AvxVnniInt16 : Avx2
{
internal AvxVnniInt16() { }
public static new bool IsSupported { get => IsSupported; }
/// <summary>Provides access to the x86 AVX-VNNI-INT16 hardware instructions, that are only available to 64-bit processes, via intrinsics.</summary>
[Intrinsic]
public new abstract class X64 : Avx2.X64
{
internal X64() { }
public static new bool IsSupported { get => IsSupported; }
}
// VPDPWSUD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<short> left, Vector128<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUSD xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAdd(Vector128<int> addend, Vector128<ushort> left, Vector128<short> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUUD xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAdd(Vector128<uint> addend, Vector128<ushort> left, Vector128<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWSUD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<short> left, Vector256<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUSD ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAdd(Vector256<int> addend, Vector256<ushort> left, Vector256<short> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUUD ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAdd(Vector256<uint> addend, Vector256<ushort> left, Vector256<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWSUDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<short> left, Vector128<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUSDS xmm1, xmm2, xmm3/m128
public static Vector128<int> MultiplyWideningAndAddSaturate(Vector128<int> addend, Vector128<ushort> left, Vector128<short> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUUDS xmm1, xmm2, xmm3/m128
public static Vector128<uint> MultiplyWideningAndAddSaturate(Vector128<uint> addend, Vector128<ushort> left, Vector128<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWSUDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<short> left, Vector256<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUSDS ymm1, ymm2, ymm3/m256
public static Vector256<int> MultiplyWideningAndAddSaturate(Vector256<int> addend, Vector256<ushort> left, Vector256<short> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUUDS ymm1, ymm2, ymm3/m256
public static Vector256<uint> MultiplyWideningAndAddSaturate(Vector256<uint> addend, Vector256<ushort> left, Vector256<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
/// <summary>Provides access to the x86 AVX10.2/512 hardware instructions for AVX-VNNI-INT16 via intrinsics.</summary>
[Intrinsic]
public abstract class V512
{
internal V512() { }
public static bool IsSupported { get => IsSupported; }
// VPDPWSUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<short> left, Vector512<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUSD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAdd(Vector512<int> addend, Vector512<ushort> left, Vector512<short> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWUUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAdd(Vector512<uint> addend, Vector512<ushort> left, Vector512<ushort> right) => MultiplyWideningAndAdd(addend, left, right);
// VPDPWSUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<short> left, Vector512<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUSDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> MultiplyWideningAndAddSaturate(Vector512<int> addend, Vector512<ushort> left, Vector512<short> right) => MultiplyWideningAndAddSaturate(addend, left, right);
// VPDPWUUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<uint> MultiplyWideningAndAddSaturate(Vector512<uint> addend, Vector512<ushort> left, Vector512<ushort> right) => MultiplyWideningAndAddSaturate(addend, left, right);
}
}
} |
Background and motivation
This API proposal introduces API surface for AVX-VNNI-INT8 and AVX-VNNI-INT16 in .NET.
Spec doc - Link
As a part of this proposal, we will have a V512 class to represent a relationship between AVX10.2 and AVX-VNNI-INT8 / AVX-VNNI-INT16 ISAs as discussed here (link).
A dependency will be added for Avx10.2
API Proposal
AVX-VNNI-INT8
AVX-VNNI-INT16
API Usage
Alternative Designs
No response
Risks
No response