Math.FusedMultiplyAdd returns wrong results on systems without FMA3 support #98704

moellerm · 2024-02-20T17:25:24Z

Description

On systems without FMA3 instruction support (older CPUs, virtual machines with FMA3 disabled), the Math.FusedMultiplyAdd method returns incorrectly rounded results.

The documentation for the method states the following:

Returns (x * y) + z, rounded as one ternary operation.

This computes (x * y) as if to infinite precision, adds z to that result as if to infinite precision, and finally rounds to the nearest representable value.

This differs from the non-fused sequence which would compute (x * y) as if to infinite precision, round the result to the nearest representable value, add z to the rounded result as if to infinite precision, and finally round to the nearest representable value.

From this, I would expect "rounds to the nearest representable value" to mean the same thing for ordinary arithmetic operations and for the Math.FusedMultiplyAdd operation. That is, the result is rounded to the nearest representable value, with ties going to even.

The rounding is done correctly if the CPU supports FMA3 in hardware. Without FMA3 support, you get different results from the same inputs. Besides being wrong, this hardware-dependent difference in results can lead to further issues.

Reproduction Steps

To reproduce the issue, evaluate the following call on a system whose CPU does not support FMA3.

Math.FusedMultiplyAdd(1.0000000000000002, 1.5, -2.220446049250313E-16)

Note: Setting COMPlus_EnableFMA=0 is not sufficient to reproduce the problem. The flag does not affect the chosen code path if hardware FMA3 is available. The only known way to reproduce it is to use a CPU without FMA3 or a system virtualized with VirtualBox (which does not expose FMA3 to the guest system).
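A minimal console program around this call might look like the sketch below. The class name `FmaRepro` and its members are made up for illustration. Printing `Fma.IsSupported` next to the result shows which kind of machine the program is running on. No expected output is listed, because the result depends on the hardware, which is the point of this report.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

// Hypothetical repro wrapper; only Math.FusedMultiplyAdd and Fma.IsSupported
// come from the report itself.
public static class FmaRepro
{
    // Inputs from the report: x = 1 + 2^-52, y = 1.5, z = -(2^-52).
    public const double X = 1.0000000000000002;
    public const double Y = 1.5;
    public const double Z = -2.220446049250313E-16;

    public static double Run() => Math.FusedMultiplyAdd(X, Y, Z);

    public static void Main()
    {
        // False on CPUs or virtual machines that do not expose FMA3.
        Console.WriteLine($"Fma.IsSupported: {Fma.IsSupported}");

        // "R" requests a round-trippable string so a 1-ulp difference is visible.
        Console.WriteLine($"FusedMultiplyAdd: {Run():R}");
    }
}
```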

The input values are the following:

1.0000000000000002 = 1 + 2^-52
1.5 = 1 + 2^-1
-2.220446049250313E-16 = -(2^-52)

The correct result is

round((x * y) + z) = round((1 + 2^-52) * (1 + 2^-1) - (2^-52)) = round(1 + 2^-1 + 2^-53) = 1 + 2^-1 = 1.5
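The derivation above can be double-checked with exact integer arithmetic. The sketch below is only an illustration: the names `ExactCheck`, `Decompose`, `ExactFma`, and `IsHalfwayBetween` are invented here, and the code assumes finite inputs. It splits each double into an integer mantissa and a power of two, forms x * y + z without any rounding, and tests whether that exact value lies exactly halfway between 1.5 and the next double above it.

```csharp
using System;
using System.Numerics;

public static class ExactCheck
{
    // Splits a finite double into (m, e) such that the value equals m * 2^e exactly.
    public static (BigInteger M, int E) Decompose(double d)
    {
        long bits = BitConverter.DoubleToInt64Bits(d);
        int exponent = (int)((bits >> 52) & 0x7FF);
        long fraction = bits & 0xFFFFFFFFFFFFFL;
        // Subnormals have no implicit leading 1 and share the minimum exponent.
        long mantissa = exponent == 0 ? fraction : fraction | (1L << 52);
        int e = (exponent == 0 ? 1 : exponent) - 1075;
        return (new BigInteger(bits < 0 ? -mantissa : mantissa), e);
    }

    // Returns the exact value of x * y + z as n * 2^e (no rounding anywhere).
    public static (BigInteger N, int E) ExactFma(double x, double y, double z)
    {
        var (mx, ex) = Decompose(x);
        var (my, ey) = Decompose(y);
        var (mz, ez) = Decompose(z);
        int ep = ex + ey;
        int e = Math.Min(ep, ez);
        return (((mx * my) << (ep - e)) + (mz << (ez - e)), e);
    }

    // True if the exact x * y + z is the midpoint of lo and hi.
    public static bool IsHalfwayBetween(double x, double y, double z, double lo, double hi)
    {
        var (n, e) = ExactFma(x, y, z);
        var (ml, el) = Decompose(lo);
        var (mh, eh) = Decompose(hi);
        int s = Math.Min(e, Math.Min(el, eh));   // common scale for all three values
        BigInteger exact = n << (e - s);
        BigInteger a = ml << (el - s);
        BigInteger b = mh << (eh - s);
        return 2 * exact == a + b;
    }

    public static void Main()
    {
        double lo = 1.5;
        double hi = Math.BitIncrement(1.5);      // 1.5 + 2^-52
        Console.WriteLine(IsHalfwayBetween(
            1.0000000000000002, 1.5, -2.220446049250313E-16, lo, hi));
        // prints: True
    }
}
```

Because the exact value is a tie, the rounding rule alone decides the result, and ties-to-even selects 1.5.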

Expected behavior

The expected return value from the method is

1.5

Note: This is also the value returned on systems with hardware support for FMA3.

Actual behavior

On a system with a CPU without FMA3 support, the method returns the wrong value

1.5000000000000002

This value is equal to

1.5000000000000002 = 1 + 2^-1 + 2^-52

which results from wrongly rounding the exact value 1 + 2^-1 + 2^-53 (exactly halfway between 1.5 and 1.5 + 2^-52) to the neighbor whose least significant bit is odd instead of the neighbor whose least significant bit is even.
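The odd and even least significant bits can be seen directly in the raw bit patterns of the two candidate results. The following is a small illustration; `LsbDemo`, `Hex`, and `LowBit` are names chosen here and are not part of any API.

```csharp
using System;

public static class LsbDemo
{
    // Raw IEEE 754 bit pattern of a double as 16 hex digits.
    public static string Hex(double d) => BitConverter.DoubleToInt64Bits(d).ToString("X16");

    // Lowest bit of the 52-bit fraction field: 0 means even, 1 means odd.
    public static long LowBit(double d) => BitConverter.DoubleToInt64Bits(d) & 1;

    public static void Main()
    {
        double expected = 1.5;                 // correctly rounded result
        double actual = 1.5000000000000002;    // value returned without FMA3

        Console.WriteLine($"expected: 0x{Hex(expected)}, low bit {LowBit(expected)}");
        Console.WriteLine($"actual:   0x{Hex(actual)}, low bit {LowBit(actual)}");
        // prints:
        // expected: 0x3FF8000000000000, low bit 0
        // actual:   0x3FF8000000000001, low bit 1
    }
}
```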

Regression?

I have tested this on a system with dotnet-runtime-6.0.1-win-x64 and with dotnet-runtime-8.0.2-win-x64. The issue is present for both versions.

Known Workarounds

The only (known) workaround is to check the flag

System.Runtime.Intrinsics.X86.Fma.IsSupported

and, if it is false, to call a separate software implementation of FusedMultiplyAdd, for example a translation of the code

https://git.musl-libc.org/cgit/musl/tree/src/math/fma.c
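One possible shape for this workaround is sketched below. Note that it is not a translation of the musl code: as a simpler stand-in, the fallback computes x * y + z exactly with `System.Numerics.BigInteger` and then rounds once to nearest, ties to even. That is much slower than musl's approach and is meant only as an illustration. The names `SafeFma`, `SoftwareFma`, and `Decompose` are hypothetical, and `BigInteger.GetBitLength` assumes .NET 5 or later. On non-x86 platforms `Fma.IsSupported` is false, so this sketch would always take the slow path there.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.X86;

public static class SafeFma
{
    // Use the runtime's FMA only when x86 FMA3 hardware is reported.
    public static double FusedMultiplyAdd(double x, double y, double z) =>
        Fma.IsSupported ? Math.FusedMultiplyAdd(x, y, z) : SoftwareFma(x, y, z);

    // Splits a finite double into (m, e) such that the value equals m * 2^e.
    private static (BigInteger M, int E) Decompose(double d)
    {
        long bits = BitConverter.DoubleToInt64Bits(d);
        int exponent = (int)((bits >> 52) & 0x7FF);
        long fraction = bits & 0xFFFFFFFFFFFFFL;
        long mantissa = exponent == 0 ? fraction : fraction | (1L << 52);
        int e = (exponent == 0 ? 1 : exponent) - 1075;
        return (new BigInteger(bits < 0 ? -mantissa : mantissa), e);
    }

    // Slow illustrative fallback: exact x * y + z, then a single rounding.
    public static double SoftwareFma(double x, double y, double z)
    {
        // NaN, infinite, or zero factors: the product needs no rounding,
        // so ordinary arithmetic already gives the IEEE 754 result.
        if (!double.IsFinite(x) || !double.IsFinite(y) || x == 0 || y == 0)
            return x * y + z;
        // Finite product plus NaN or infinite addend: the addend wins.
        if (!double.IsFinite(z))
            return z;

        var (mx, ex) = Decompose(x);
        var (my, ey) = Decompose(y);
        var (mz, ez) = Decompose(z);
        int ep = ex + ey;
        int e = Math.Min(ep, ez);
        BigInteger n = ((mx * my) << (ep - e)) + (mz << (ez - e));
        if (n.IsZero)
            return 0.0; // exact cancellation yields +0 under round-to-nearest

        BigInteger mag = BigInteger.Abs(n);
        int top = (int)mag.GetBitLength() - 1 + e;  // exponent of the leading bit
        int q = Math.Max(top - 52, -1074);          // exponent of the result's last bit
        int shift = q - e;
        BigInteger m;
        if (shift <= 0)
        {
            m = mag << -shift;                      // already fits, no rounding needed
        }
        else
        {
            m = mag >> shift;                       // truncated mantissa
            BigInteger rem = mag - (m << shift);    // discarded low bits
            BigInteger half = BigInteger.One << (shift - 1);
            // Round to nearest; on an exact tie, round to the even mantissa.
            if (rem > half || (rem == half && !m.IsEven))
                m += 1;
        }

        // m is at most 2^53, so the cast is exact; ScaleB overflows to infinity.
        double r = Math.ScaleB((double)m, q);
        return n.Sign < 0 ? -r : r;
    }

    public static void Main()
    {
        double r = SoftwareFma(1.0000000000000002, 1.5, -2.220446049250313E-16);
        Console.WriteLine(r == 1.5);
        // prints: True
    }
}
```

For code on a hot path, a translation of a tuned implementation such as the musl one linked above is likely a better fit than this exact-arithmetic version.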

Configuration

For the tested .NET versions, see above. Tested on Windows 7 and Windows 10.

Other information

Other input values that produce wrong results on systems without FMA3 support:

Math.FusedMultiplyAdd(-0.0070345722407623157, 0.97928941403450775, 6.5305722571759166e-19) should be -0.006888882127639542 but was -0.0068888821276395411
Math.FusedMultiplyAdd(0.00982626125901076, 0.99963622050958112, -7.6568823330795646e-19) should be 0.0098226866666972328 but was 0.0098226866666972345
Math.FusedMultiplyAdd(6.4258928890587477, 0.0073206296495525883, 2.8809760228891598e-18) should be 0.047041582008492615 but was 0.047041582008492608
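The inputs listed in this report can be collected into a small self-check. The sketch below is illustrative only (the class `FmaCases` and its members are invented names). It takes the expected values exactly as stated in this report and reports every case where a given FMA implementation disagrees with them.

```csharp
using System;

public static class FmaCases
{
    // (x, y, z, expected) copied from the report; expected values are the
    // "should be" results stated there.
    public static readonly (double X, double Y, double Z, double Expected)[] Cases =
    {
        (1.0000000000000002, 1.5, -2.220446049250313E-16, 1.5),
        (-0.0070345722407623157, 0.97928941403450775, 6.5305722571759166e-19, -0.006888882127639542),
        (0.00982626125901076, 0.99963622050958112, -7.6568823330795646e-19, 0.0098226866666972328),
        (6.4258928890587477, 0.0073206296495525883, 2.8809760228891598e-18, 0.047041582008492615),
    };

    // Runs every case through the given implementation and counts disagreements.
    public static int CountMismatches(Func<double, double, double, double> fma)
    {
        int mismatches = 0;
        foreach (var (x, y, z, expected) in Cases)
        {
            double actual = fma(x, y, z);
            if (actual != expected)
            {
                mismatches++;
                Console.WriteLine(
                    $"FMA({x:R}, {y:R}, {z:R}) should be {expected:R} but was {actual:R}");
            }
        }
        return mismatches;
    }

    public static void Main()
    {
        int bad = CountMismatches(Math.FusedMultiplyAdd);
        Console.WriteLine(bad == 0 ? "all cases match" : $"{bad} mismatching case(s)");
    }
}
```

Passing the implementation as a delegate lets the same cases be run against `Math.FusedMultiplyAdd` or against any replacement implementation.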


ghost · 2024-02-20T17:25:32Z

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	moellerm
Assignees:	-
Labels:	`area-System.Numerics`
Milestone:	-

tannergooding · 2024-02-20T17:48:59Z

Can you clarify what operating system you're running against?

FMA simply defers down to the C runtime if hardware acceleration is not available. So my best guess is that there is potentially a bug in the underlying FMA algorithm used by the C runtime for whichever OS you've mentioned.

moellerm · 2024-02-20T18:09:49Z

I have tested this on Windows 7 and Windows 10. I'm not sure which method in which dll finally gets called for the software fallback, but I suspect it is fma from ucrtbase.dll.

Based on the documentation

https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fma-fmaf-fmal?view=msvc-170

and another bug report for it

https://developercommunity.visualstudio.com/t/C-std::fma-returns-incorrect-result/242309

I get the impression that this is not really a complete software implementation of FusedMultiplyAdd, but rather a method that produces an estimate of the value that FusedMultiplyAdd should return.

tannergooding · 2024-02-20T18:43:47Z

> I get the impression, that this is not really a complete software implementation of FusedMultiplyAdd but rather a method that produces an estimate for the value that FusedMultiplyAdd should return.

Well, that's definitely a bug. The C specification requires that fma behave correctly rather than estimate, so unless the user opts into /fp:fast, it shouldn't happen.

tannergooding · 2024-02-20T18:58:05Z

I've opened a new bug against MSVC here: https://developercommunity.visualstudio.com/t/MSVCs-fma-implementation-is-incorrect-o/10594003. It covers the IEEE 754 and C Programming Language specification requirements.

clairvoyante · 2025-05-06T02:46:53Z

https://developercommunity.visualstudio.com/t/MSVCs-fma-implementation-is-incorrect-o/10594003#T-N10885754

dotnet-issue-labeler bot added the area-System.Numerics label Feb 20, 2024

ghost added the untriaged label (New issue has not been triaged by the area owner) Feb 20, 2024

clairvoyante mentioned this issue Feb 21, 2024

Fused multiply-add: proposal to add math.fma() python/cpython#73468

Closed

sDIMMaX mentioned this issue Mar 7, 2024

2.5.0-beta.20 - Convex hull error bepu/bepuphysics2#313

Closed

tannergooding added the tracking-external-issue label (The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly) and removed the untriaged label (New issue has not been triaged by the area owner) Jun 24, 2024

tannergooding added this to the Future milestone Jun 24, 2024

tannergooding closed this as completed May 6, 2025
