Math.FusedMultiplyAdd returns wrong results on systems without FMA3 support #98704

Closed
moellerm opened this issue Feb 20, 2024 · 6 comments
Labels
area-System.Numerics tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly
@moellerm

Description

On systems without FMA3 instruction support (older CPUs, virtual machines with FMA3 disabled), the Math.FusedMultiplyAdd method returns incorrectly rounded results.

The documentation for the method states the following:

Returns (x * y) + z, rounded as one ternary operation.

This computes (x * y) as if to infinite precision, adds z to that result as if to infinite precision, and finally rounds to the nearest representable value.

This differs from the non-fused sequence which would compute (x * y) as if to infinite precision, round the result to the nearest representable value, add z to the rounded result as if to infinite precision, and finally round to the nearest representable value.

From this, I would expect "rounds to the nearest representable value" to mean the same thing for Math.FusedMultiplyAdd as for ordinary arithmetic operations, that is, round to nearest with ties to even.

The rounding is done correctly if the CPU supports FMA3 in hardware. Without FMA3 support, the same inputs produce different results. Besides being wrong, this hardware-dependent change in behavior can lead to further issues.

Reproduction Steps

To reproduce the issue, evaluate the following expression on a system whose CPU does not support FMA3.

Math.FusedMultiplyAdd(1.0000000000000002, 1.5, -2.220446049250313E-16)

Note: Setting COMPlus_EnableFMA=0 is not sufficient to reproduce the problem. The flag does not affect the chosen code path if hardware FMA3 is available. The only (known) way to reproduce it is to use a CPU without FMA3 or a system virtualized with VirtualBox (which does not expose FMA3 to the guest system).

The input values are the following:

1.0000000000000002 = 1 + 2^-52
1.5 = 1 + 2^-1
-2.220446049250313E-16 = -(2^-52)

The correct result is

round((x * y) + z) = round((1 + 2^-52) * (1 + 2^-1) - (2^-52)) = round(1 + 2^-1 + 2^-53) = 1 + 2^-1 = 1.5
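
The derivation above can be checked with exact rational arithmetic. The following is a minimal sketch in Python (not the language of this issue, used here only because `fractions.Fraction` represents doubles exactly). The helper name `fma_exact` is hypothetical, and the sketch assumes finite inputs and ignores signed zeros, infinities, and NaN.

```python
from fractions import Fraction

def fma_exact(x: float, y: float, z: float) -> float:
    """Reference x*y + z with a single rounding (finite inputs only)."""
    # Fraction(float) is exact, so the product and the sum carry no rounding error.
    exact = Fraction(x) * Fraction(y) + Fraction(z)
    # Converting the Fraction to float rounds once, to nearest, ties to even.
    return float(exact)

x, y, z = 1.0000000000000002, 1.5, -2.220446049250313e-16
exact = Fraction(x) * Fraction(y) + Fraction(z)

# The exact value is 1 + 2^-1 + 2^-53, halfway between two adjacent doubles.
assert exact == 1 + Fraction(1, 2) + Fraction(1, 2**53)
print(fma_exact(x, y, z))  # 1.5
```

Because the exact value is a tie, the result depends entirely on the tie-breaking rule, which makes this input a useful test case.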

Expected behavior

The expected return value from the method is

1.5

Note: This is also the value returned on systems with hardware support for FMA3.

Actual behavior

On a system with a CPU without FMA3 support, the method returns the wrong value

1.5000000000000002

This value is equal to

1.5000000000000002 = 1 + 2^-1 + 2^-52

which results from incorrectly rounding the exact value 1 + 2^-1 + 2^-53, a tie between two adjacent doubles, to the neighbor with an odd least significant bit instead of the neighbor with an even one.
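
To make the even/odd distinction concrete, the sketch below (Python, with a hypothetical helper `bits`) prints the raw binary64 encodings of the two candidate results. It also evaluates the non-fused sequence described in the documentation quote. For these inputs the non-fused sequence happens to produce the same value as the reported wrong result; this is an observation only and does not establish what the software fallback does internally.

```python
import struct

def bits(value: float) -> str:
    """Raw IEEE 754 binary64 encoding of a float as a hex string."""
    return struct.pack(">d", value).hex()

x, y, z = 1.0000000000000002, 1.5, -2.220446049250313e-16

print(bits(1.5))                 # 3ff8000000000000 (last significand bit 0, even)
print(bits(1.5000000000000002))  # 3ff8000000000001 (last significand bit 1, odd)

# Non-fused sequence: x * y is rounded first, then the sum is rounded again.
unfused = x * y + z
print(unfused)  # 1.5000000000000002
```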

Regression?

I have tested this on a system with dotnet-runtime-6.0.1-win-x64 and with dotnet-runtime-8.0.2-win-x64. The issue is present for both versions.

Known Workarounds

The only (known) workaround is to check the flag

System.Runtime.Intrinsics.X86.Fma.IsSupported

and, if it is false, to call a separate software implementation of FusedMultiplyAdd, for example a translation of the code at

https://git.musl-libc.org/cgit/musl/tree/src/math/fma.c

Configuration

For the tested .NET versions, see above. Tested on Windows 7 and Windows 10.

Other information

Other input values that produce wrong results on systems without FMA3 support:

Math.FusedMultiplyAdd(-0.0070345722407623157, 0.97928941403450775, 6.5305722571759166e-19) should be -0.006888882127639542 but was -0.0068888821276395411
Math.FusedMultiplyAdd(0.00982626125901076, 0.99963622050958112, -7.6568823330795646e-19) should be 0.0098226866666972328 but was 0.0098226866666972345
Math.FusedMultiplyAdd(6.4258928890587477, 0.0073206296495525883, 2.8809760228891598e-18) should be 0.047041582008492615 but was 0.047041582008492608
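
The additional inputs above can be examined in the same way. The sketch below (Python; `fma_exact` and `ulps_apart` are hypothetical helper names, and finite inputs are assumed) computes a singly rounded reference for each case with exact rational arithmetic, and reports how many representable doubles lie between the "should be" and "but was" values as listed. No expected output is shown because the results were not verified here.

```python
import struct
from fractions import Fraction

def fma_exact(x: float, y: float, z: float) -> float:
    """Reference x*y + z with a single rounding (finite inputs only)."""
    return float(Fraction(x) * Fraction(y) + Fraction(z))

def ulps_apart(a: float, b: float) -> int:
    """Number of representable doubles between two finite floats of the same sign."""
    ia = struct.unpack(">q", struct.pack(">d", a))[0]
    ib = struct.unpack(">q", struct.pack(">d", b))[0]
    return abs(ia - ib)

# (x, y, z, "should be", "but was"), copied from the list in the report.
cases = [
    (-0.0070345722407623157, 0.97928941403450775, 6.5305722571759166e-19,
     -0.006888882127639542, -0.0068888821276395411),
    (0.00982626125901076, 0.99963622050958112, -7.6568823330795646e-19,
     0.0098226866666972328, 0.0098226866666972345),
    (6.4258928890587477, 0.0073206296495525883, 2.8809760228891598e-18,
     0.047041582008492615, 0.047041582008492608),
]

for x, y, z, expected, reported in cases:
    reference = fma_exact(x, y, z)
    print(f"reference={reference!r} "
          f"matches_expected={reference == expected} "
          f"ulps_between_expected_and_reported={ulps_apart(expected, reported)}")
```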
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Feb 20, 2024
@ghost

ghost commented Feb 20, 2024

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

@tannergooding
Member

Can you clarify what operating system you're running against?

FMA simply defers to the C runtime if hardware acceleration is not available. My best guess is that there is a bug in the underlying FMA algorithm used by the C runtime on whichever OS you're running.

@moellerm
Author

I have tested this on Windows 7 and Windows 10. I'm not sure which method in which dll finally gets called for the software fallback, but I suspect it is fma from ucrtbase.dll.

Based on the documentation

https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fma-fmaf-fmal?view=msvc-170

and another bug report for it

https://developercommunity.visualstudio.com/t/C-std::fma-returns-incorrect-result/242309

I get the impression that this is not really a complete software implementation of FusedMultiplyAdd but rather a method that produces an estimate of the value that FusedMultiplyAdd should return.

@tannergooding
Member

"I get the impression that this is not really a complete software implementation of FusedMultiplyAdd but rather a method that produces an estimate of the value that FusedMultiplyAdd should return."

Well, that's definitely a bug. The C specification requires that fma behave correctly and not estimate (so unless the user opts into /fp:fast, it shouldn't happen).

@tannergooding
Member

I've opened a new bug against MSVC here: https://developercommunity.visualstudio.com/t/MSVCs-fma-implementation-is-incorrect-o/10594003. It covers the relevant IEEE 754 and C Programming Language specification requirements.

@tannergooding tannergooding added tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly and removed untriaged New issue has not been triaged by the area owner labels Jun 24, 2024
@tannergooding tannergooding added this to the Future milestone Jun 24, 2024
3 participants