BUG: VSX3 optimizations broken with float16 on big-endian #25178

matoro · 2023-11-19T00:06:58Z

Describe the issue:

Downstream bug: https://bugs.gentoo.org/917544

When NumPy is compiled for a big-endian target that supports VSX3 optimizations (for example, -mcpu=power9), numpy.float16 is broken. Targets at VSX2 and below (for example, -mcpu=power8) are fine.

The problem does not reproduce on little-endian at any optimization level.

This is reflected in the tests, which pass at -mcpu=power8 and have 255 failures at -mcpu=power9. Full test logs: build.log

The snippet below is a minimized reproducer which demonstrates the problem.

Unfortunately, I cannot compare against older versions, because this is the first working meson-based version that even compiles with -mcpu=power9 (see #24789).

If you don't have a suitable machine available, I offer free shell access to the machines I used to reproduce this here.

CC @seiko2plus @zeldin

Reproduce the code example:

import numpy as np
np.float16(1.0)
np.float16(1.0)+0.0    # operand is widened before addition
np.float16(1.0)+np.float16(0.0)

Error message:

Python 3.11.5 (main, Oct 21 2023, 17:50:00) [GCC 12.3.1 20230526] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
/usr/lib/python3.11/site-packages/numpy/core/getlimits.py:52: RuntimeWarning: divide by zero encountered in log10
  self.precision = int(-log10(self.eps))
>>> np.float16(1.0)
1.0
>>> np.float16(1.0)+0.0    # operand is widened before addition
1.0
>>> np.float16(1.0)+np.float16(0.0)
0.0
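
A slightly broader check than the three-line reproducer can be sketched as follows. It is only an illustration: `float16_add_mismatches` is a hypothetical helper, not part of NumPy or of the original report. The inputs are chosen so that every sum is exactly representable in float16, which means any mismatch points at the float16 arithmetic path rather than at rounding.

```python
import numpy as np


# Hypothetical helper for illustration; not part of NumPy or the report.
def float16_add_mismatches(values=(0.0, 0.5, 1.0, 2.0, -3.5)):
    """Return (a, b, got, expected) for each float16 sum that is wrong."""
    mismatches = []
    for a in values:
        for b in values:
            # Both operands are float16 scalars, as in the failing case above.
            got = float(np.float16(a) + np.float16(b))
            # All inputs and sums are small multiples of 0.5, so the
            # Python float result is exactly what float16 should give.
            expected = a + b
            if got != expected:
                mismatches.append((a, b, got, expected))
    return mismatches


if __name__ == "__main__":
    bad = float16_add_mismatches()
    if bad:
        print(f"{len(bad)} mismatches, first: {bad[0]}")
    else:
        print("float16 addition OK")
```

On an unaffected build the helper should return an empty list. On the affected big-endian VSX3 build it would be expected to report at least the 1.0 + 0.0 case shown above.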

Runtime information:

Python 3.11.5 (main, Aug 28 2023, 05:57:37) [GCC 12.3.1 20230526] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, numpy; print(numpy.__version__); print(sys.version)
1.26.2
3.11.5 (main, Aug 28 2023, 05:57:37) [GCC 12.3.1 20230526]
>>> print(numpy.show_runtime())
WARNING: `threadpoolctl` not found in system! Install it by `pip install threadpoolctl`. Once installed, try `np.show_runtime` again for more detailed build information
[{'numpy_version': '1.26.2',
  'python': '3.11.5 (main, Aug 28 2023, 05:57:37) [GCC 12.3.1 20230526]',
  'uname': uname_result(system='Linux', node='matoro-ppc64dev', release='6.6.1-gentoo-ppc64', version='#1 SMP Wed Nov  8 14:31:08 EST 2023', machine='ppc64')},
 {'simd_extensions': {'baseline': ['VSX', 'VSX2'],
                      'found': ['VSX3'],
                      'not_found': ['VSX4']}}]
None
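
The output above reports machine='ppc64' but does not state the byte order explicitly. Since the bug only reproduces on big-endian, a small sketch like the one below can be attached to similar reports to make that explicit. `describe_target` is a hypothetical helper that uses only the standard library.

```python
import platform
import sys


# Hypothetical helper for illustration; uses only the standard library.
def describe_target():
    """Return the machine name and byte order of the running interpreter."""
    return {
        "machine": platform.machine(),  # e.g. 'ppc64' on the reporter's host
        "byteorder": sys.byteorder,     # 'big' or 'little'
    }


if __name__ == "__main__":
    print(describe_target())
```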

Context for the issue:

No response


seiko2plus · 2023-11-20T02:46:58Z

Thanks for reporting this issue, #25195 should resolve it.

mattip · 2023-11-20T20:11:28Z

@matoro could you confirm the fix from #25195 which was merged?

matoro · 2023-11-20T20:29:35Z

> @matoro could you confirm the fix from #25195 which was merged?

The original reporter already confirmed it in our bug tracker! https://bugs.gentoo.org/917544#c8

matoro added the 00 - Bug label Nov 19, 2023

charris added this to the 1.26.3 release milestone Nov 19, 2023

seiko2plus mentioned this issue Nov 20, 2023

BUG: Fix single to half-precision conversion on PPC64/VSX3 #25195

Merged

mattip closed this as completed in #25195 Nov 20, 2023
