ENH, SIMD: Add new NPYV intrinsics pack(1) #17790

seiko2plus · 2020-11-17T05:18:20Z

ENH, SIMD: Add new NPYV intrinsics pack(1)

add bitwise logical operations for boolean vectors
add round conversion for float vectors
add NAN test for float vectors
add conditional addition and subtraction
add #define NPY_SIMD_FMA3 to check for native fused multiply-add (FMA) support
add testing cases for all of the above

required by #17587
merge after #17789

TODO:

test it on armhf
remove temporary commit of ENH, SIMD: Add new NPYV intrinsics pack(0) #17789

seiko2plus · 2020-12-14T16:43:04Z

ping @mattip

mattip · 2020-12-14T21:15:16Z

Is there a way to get comments into the tests so it is clear which test is relating to which npyv_* primitive?

seiko2plus · 2020-12-15T01:12:15Z

@mattip, we could add the last NPYV calls to the traceback, but the current traceback is already pretty good: it shows the tested SIMD target, the data types, and the SIMD width.

traceback sample

self = <numpy.core.tests.test_simd.Test_SIMD_FP32_256_FMA3__AVX2_f32 object at 0x7fbfc1eade20>

    def test_conversions(self):
        features = self._cpu_features()
        if not self.npyv.simd_f64 and re.match(r".*(NEON|ASIMD)", features):
            # very costly to emulate nearest even on Armv7
            # instead we round halves to up. e.g. 0.5 -> 1, -0.5 -> -1
            _round = lambda v: int(v + (0.5 if v >= 0 else -0.5))
        else:
            _round = round
    
        vdata_a = self.load(self._data())
        vdata_a = self.sub(vdata_a, self.setall(0.5))
        data_round = [_round(x) for x in vdata_a]
        vround = self.round_s32(vdata_a)
>       assert vround != data_round
E       assert <npyv_s32 of [0, 2, 2, 4, 4, 6, 6, 8]> != [0, 2, 2, 4, 4, 6, ...]

_round     = <built-in function round>
data_round = [0, 2, 2, 4, 4, 6, ...]
features   = 'FMA3 AVX2'
self       = <numpy.core.tests.test_simd.Test_SIMD_FP32_256_FMA3__AVX2_f32 object at 0x7fbfc1eade20>
vdata_a    = <npyv_f32 of [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]>
vround     = <npyv_s32 of [0, 2, 2, 4, 4, 6, 6, 8]>

numpy/core/tests/test_simd.py:219: AssertionError
__________________________________________________________________________________ Test_SIMD_FP32_128_SSE42_f32.test_conversions __________________________________________________________________________________

self = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_SSE42_f32 object at 0x7fbfc1be1e20>

    def test_conversions(self):
        features = self._cpu_features()
        if not self.npyv.simd_f64 and re.match(r".*(NEON|ASIMD)", features):
            # very costly to emulate nearest even on Armv7
            # instead we round halves to up. e.g. 0.5 -> 1, -0.5 -> -1
            _round = lambda v: int(v + (0.5 if v >= 0 else -0.5))
        else:
            _round = round
    
        vdata_a = self.load(self._data())
        vdata_a = self.sub(vdata_a, self.setall(0.5))
        data_round = [_round(x) for x in vdata_a]
        vround = self.round_s32(vdata_a)
>       assert vround != data_round
E       assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]

_round     = <built-in function round>
data_round = [0, 2, 2, 4]
features   = 'SSE42'
self       = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_SSE42_f32 object at 0x7fbfc1be1e20>
vdata_a    = <npyv_f32 of [0.5, 1.5, 2.5, 3.5]>
vround     = <npyv_s32 of [0, 2, 2, 4]>

numpy/core/tests/test_simd.py:219: AssertionError
________________________________________________________________________________ Test_SIMD_FP32_128_baseline_f32.test_conversions _________________________________________________________________________________

self = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_baseline_f32 object at 0x7fbfc1b56a90>

    def test_conversions(self):
        features = self._cpu_features()
        if not self.npyv.simd_f64 and re.match(r".*(NEON|ASIMD)", features):
            # very costly to emulate nearest even on Armv7
            # instead we round halves to up. e.g. 0.5 -> 1, -0.5 -> -1
            _round = lambda v: int(v + (0.5 if v >= 0 else -0.5))
        else:
            _round = round
    
        vdata_a = self.load(self._data())
        vdata_a = self.sub(vdata_a, self.setall(0.5))
        data_round = [_round(x) for x in vdata_a]
        vround = self.round_s32(vdata_a)
>       assert vround != data_round
E       assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]

_round     = <built-in function round>
data_round = [0, 2, 2, 4]
features   = 'SSE SSE2 SSE3'
self       = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_baseline_f32 object at 0x7fbfc1b56a90>
vdata_a    = <npyv_f32 of [0.5, 1.5, 2.5, 3.5]>
vround     = <npyv_s32 of [0, 2, 2, 4]>

numpy/core/tests/test_simd.py:219: AssertionError
============================================================================================= short test summary info =============================================================================================
FAILED numpy/core/tests/test_simd.py::Test_SIMD_FP32_256_FMA3__AVX2_f32::test_conversions - assert <npyv_s32 of [0, 2, 2, 4, 4, 6, 6, 8]> != [0, 2, 2, 4, 4, 6, ...]
FAILED numpy/core/tests/test_simd.py::Test_SIMD_FP32_128_SSE42_f32::test_conversions - assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]
FAILED numpy/core/tests/test_simd.py::Test_SIMD_FP32_128_baseline_f32::test_conversions - assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]
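
The two rounding modes named in the test comments can be compared in plain Python. This is only a sketch: `round_half_away` is a hypothetical name for the lambda used in the Armv7 branch above, and the input values are taken from the `vdata_a` shown in the 128-bit tracebacks.

```python
def round_half_away(v):
    """Fallback used in the Armv7 branch: halves go away from zero."""
    return int(v + (0.5 if v >= 0 else -0.5))

data = [0.5, 1.5, 2.5, 3.5]

# Python's built-in round() rounds halves to the nearest even integer.
nearest_even = [round(x) for x in data]
# The fallback rounds halves away from zero instead.
half_away = [round_half_away(x) for x in data]

print(nearest_even)  # [0, 2, 2, 4]
print(half_away)     # [1, 2, 3, 4]
```

The nearest-even result matches the `data_round` values in the traceback, which is why the test only switches to the fallback when the target cannot round to nearest even cheaply.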

mattip · 2020-12-15T10:11:48Z

Sorry for not being clear. I meant adding some kind of comment or other notation in the test file itself to make it clear which intrinsics each test is checking. This will make review easier, since the reviewer can see that an added passing test is aimed at a specific intrinsic or set of intrinsics.

seiko2plus · 2020-12-15T21:55:38Z

numpy/core/tests/test_simd.py

+    def test_mask_conditional(self):
+        vdata_a = self.load(self._data())
+        vdata_b = self.load(self._data(reverse=True))
+        true_mask  = self.cmpeq(self.zero(), self.zero())
+        false_mask = self.cmpneq(self.zero(), self.zero())
+
+        data_sub = self.sub(vdata_b, vdata_a)
+        ifsub = self.ifsub(true_mask, vdata_b, vdata_a, vdata_b)
+        assert ifsub == data_sub
+        ifsub = self.ifsub(false_mask, vdata_a, vdata_b, vdata_b)
+        assert ifsub == vdata_b
+
+        data_add = self.add(vdata_b, vdata_a)
+        ifadd = self.ifadd(true_mask, vdata_b, vdata_a, vdata_b)
+        assert ifadd == data_add
+        ifadd = self.ifadd(false_mask, vdata_a, vdata_b, vdata_b)
+        assert ifadd == vdata_b
+


@mattip,

I meant adding some kind of comment or other notation in the test file itself to make it clear what intrinsics each test is checking.

Please take a look here: this test checks mask conditional operations on several SIMD extensions with different data types. At the same time, we rely on other NPYV intrinsics to generate the test data and arguments, in order to reduce the amount of code.

In other words, the code here is equivalent to the following code, except that it tests all vector data types on all supported SIMD targets. Sweet, isn't it?

following code

    from numpy.core._simd import targets
    npyv = targets["baseline"]
    vdata_a = npyv.load_u8(range(npyv.nlanes_u8))
    vdata_b = npyv.load_u8(reversed(range(npyv.nlanes_u8)))
    true_mask = npyv.cmpeq_u8(npyv.zero_u8(), npyv.zero_u8())
    false_mask = npyv.cmpneq_u8(npyv.zero_u8(), npyv.zero_u8())

    data_sub = npyv.sub_u8(vdata_b, vdata_a)
    ifsub = npyv.ifsub_u8(true_mask, vdata_b, vdata_a, vdata_b)
    assert ifsub == data_sub
    ifsub = npyv.ifsub_u8(false_mask, vdata_a, vdata_b, vdata_b)
    assert ifsub == vdata_b

    data_add = npyv.add_u8(vdata_b, vdata_a)
    ifadd = npyv.ifadd_u8(true_mask, vdata_b, vdata_a, vdata_b)
    assert ifadd == data_add
    ifadd = npyv.ifadd_u8(false_mask, vdata_a, vdata_b, vdata_b)
    assert ifadd == vdata_b
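
One way a single test body could resolve `self.add` to `npyv.add_u8`, `npyv.add_s32`, and so on is suffix dispatch through `__getattr__`. The sketch below is an assumption for illustration only: `FakeNpyv` and `SuffixDispatcher` are hypothetical names, lanes are plain Python lists, and nothing here is taken from the actual test helper.

```python
class FakeNpyv:
    """Hypothetical stand-in for one SIMD target; lanes are plain lists."""

    @staticmethod
    def add_u8(a, b):
        # Unsigned 8-bit lanes wrap around at 256.
        return [(x + y) & 0xFF for x, y in zip(a, b)]

    @staticmethod
    def add_s32(a, b):
        # Wraparound is ignored here to keep the sketch short.
        return [x + y for x, y in zip(a, b)]


class SuffixDispatcher:
    """Forward `obj.add(...)` to `npyv.add_<sfx>(...)`."""

    def __init__(self, npyv, sfx):
        self.npyv = npyv
        self.sfx = sfx

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails,
        # so `npyv` and `sfx` themselves never recurse.
        return getattr(self.npyv, f"{name}_{self.sfx}")


u8 = SuffixDispatcher(FakeNpyv, "u8")
s32 = SuffixDispatcher(FakeNpyv, "s32")

sum_u8 = u8.add([250, 1], [10, 2])
sum_s32 = s32.add([250, 1], [10, 2])
print(sum_u8)   # [4, 3]
print(sum_s32)  # [260, 3]
```

With a wrapper like this, the same test method can be instantiated once per data type and per target, which is the code reduction described above.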

So I wonder: how can we add notations or special marks? We could rename the parameter self to npyv, but would that help? I guess not. The only solution I can think of right now is adding a docstring to each test function that clarifies the following:

the place where these intrinsics are defined in the source

the signatures of each one of them

Let's start with a hand written docstring that just mentions "testing cmpeq_u8, cmpneq_u8, ..." for every new test in this PR. No need to do more than that now. Then the reviewers can see which intrinsics have been added and that there are tests that cover them.

seiko2plus · 2020-12-18T05:04:48Z

numpy/core/tests/test_simd.py

+        Conditional addition and subtraction for all supported data types.
+        Samples:
+            npyv_s32 npyv_ifadd_s32(npyv_b32 mask, npyv_s32 a, npyv_s32 b, npyv_s32 c) ->
+                mask ? a + b : c
+            npyv_f64 npyv_ifsub_f64(npyv_b64 mask, npyv_f64 a, npyv_f64 b, npyv_f64 c) ->
+                mask ? a - b : c
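
The semantics in this docstring (`mask ? a + b : c` and `mask ? a - b : c`) can be written as a scalar reference model. This is a rough sketch, assuming plain Python lists for vectors and booleans for mask lanes; `ref_ifadd` and `ref_ifsub` are hypothetical names, and integer wraparound is ignored.

```python
def ref_ifadd(mask, a, b, c):
    """Per lane: a + b where the mask is set, otherwise c."""
    return [x + y if m else z for m, x, y, z in zip(mask, a, b, c)]


def ref_ifsub(mask, a, b, c):
    """Per lane: a - b where the mask is set, otherwise c."""
    return [x - y if m else z for m, x, y, z in zip(mask, a, b, c)]


a = [0, 1, 2, 3]
b = [3, 2, 1, 0]
all_true = [True] * 4
all_false = [False] * 4
mixed = [True, False, True, False]

# All-true mask behaves like a plain add or sub; all-false returns c.
add_true = ref_ifadd(all_true, b, a, b)
sub_false = ref_ifsub(all_false, a, b, b)
add_mixed = ref_ifadd(mixed, a, b, b)
sub_mixed = ref_ifsub(mixed, a, b, b)

print(add_true)   # [3, 3, 3, 3]
print(sub_false)  # [3, 2, 1, 0]
print(add_mixed)  # [3, 2, 3, 0]
print(sub_mixed)  # [-3, 2, 1, 0]
```

The test in this PR only uses all-true and all-false masks; the mixed mask is included here just to show the per-lane selection.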


@mattip, Is that okay?

I was thinking along the lines of just

Test npyv_ifadd_##SFX and npyv_ifsub_##SFX

That way it is easy to grep the tests and header files for coverage without needing too much added prose in the test

something like that:

    def test_mask_conditional(self):
        """
        Test the following intrinsics:
        - npyv_ifadd_u8, npyv_ifadd_s8, npyv_ifadd_u16, npyv_ifadd_s16,
          npyv_ifadd_u32, npyv_ifadd_s32, npyv_ifadd_u64, npyv_ifadd_s64,
          npyv_ifadd_f32, npyv_ifadd_f64
        - npyv_ifsub_u8, npyv_ifsub_s8, npyv_ifsub_u16, npyv_ifsub_s16,
          npyv_ifsub_u32, npyv_ifsub_s32, npyv_ifsub_u64, npyv_ifsub_s64,
          npyv_ifsub_f32, npyv_ifsub_f64
        """

No, I meant exactly npyv_ifadd_##SFX since that is the grep-able macro name. The name npyv_ifadd_u8 does not appear in the source code.

The name npyv_ifadd_u8 does not appear in the source code.

Because it's an inline function generated by a C macro, defined in avx512/maskop.h and emulate_maskop.h.

I think a docstring that mentions the actual C macro npyv_ifadd_##SFX would be more helpful in linking the test to the code it is testing.
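
The grep-based coverage idea can be sketched in a few lines of Python. Everything below is an assumption for illustration: the header and test text are made-up stand-ins rather than NumPy sources, and `npyv_example_##SFX` is a hypothetical macro name added to show an uncovered case.

```python
import re

# Made-up stand-in for a header that defines intrinsics through a macro.
HEADER_TEXT = """
npyv_##SFX npyv_ifadd_##SFX(npyv_##BSFX m, npyv_##SFX a, npyv_##SFX b, npyv_##SFX c);
npyv_##SFX npyv_ifsub_##SFX(npyv_##BSFX m, npyv_##SFX a, npyv_##SFX b, npyv_##SFX c);
npyv_##SFX npyv_example_##SFX(npyv_##SFX a);
"""

# Made-up stand-in for a test file whose docstring names the macros.
TEST_TEXT = '''
def test_mask_conditional(self):
    """Test npyv_ifadd_##SFX and npyv_ifsub_##SFX"""
'''

# Matches names such as npyv_ifadd_##SFX, but not the bare type npyv_##SFX.
MACRO_NAME = re.compile(r"\bnpyv_[a-z0-9_]+_##SFX\b")

defined = set(MACRO_NAME.findall(HEADER_TEXT))
tested = set(MACRO_NAME.findall(TEST_TEXT))
untested = sorted(defined - tested)

print(untested)  # ['npyv_example_##SFX']
```

Because the docstring uses the same token that appears in the header, a plain text search is enough to link a test to the macro it covers.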

I literally followed your suggestion, but I don't think it would make for a useful coverage test. I think it may be possible to generate _simd.dispatch.c.src directly from the NPYV headers, including the docstrings. Would that be a good idea?

Maybe in a future PR. I think I would still like the resulting generated test code to be checked in, so we can read it. Otherwise it will be very painful to track down errors in tests.


mattip · 2020-12-22T20:59:48Z

Thanks @seiko2plus

seiko2plus mentioned this pull request Nov 17, 2020

SIMD: Replace raw SIMD of sin/cos with NPYV(universal intrinsics) #17587

Merged

5 tasks

seiko2plus force-pushed the npyv_new_intrinsic_pk1 branch 2 times, most recently from b024a59 to cea4492 Compare December 14, 2020 01:25

seiko2plus force-pushed the npyv_new_intrinsic_pk1 branch from cea4492 to 66e8db7 Compare December 18, 2020 05:01

seiko2plus force-pushed the npyv_new_intrinsic_pk1 branch from 66e8db7 to d7a183e Compare December 18, 2020 05:14

seiko2plus force-pushed the npyv_new_intrinsic_pk1 branch from d7a183e to 150d459 Compare December 22, 2020 20:35

mattip merged commit 3b39031 into numpy:master Dec 22, 2020

seiko2plus mentioned this pull request Dec 26, 2020

ENH: libdivide for unsigned integers #18055

Closed

rgommers added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Jul 12, 2022
