ENH, SIMD: Add new NPYV intrinsics pack(1) #17790
b024a59
to
cea4492
Compare
ping @mattip
Is there a way to get comments into the tests so it is clear which test relates to which intrinsic?
@mattip, we can add the last NPYV calls to the traceback, but the current traceback is already pretty good: it has the tested SIMD target, data types, and SIMD width.
traceback sample
self = <numpy.core.tests.test_simd.Test_SIMD_FP32_256_FMA3__AVX2_f32 object at 0x7fbfc1eade20>
def test_conversions(self):
features = self._cpu_features()
if not self.npyv.simd_f64 and re.match(r".*(NEON|ASIMD)", features):
# very costly to emulate nearest even on Armv7
# instead we round halves to up. e.g. 0.5 -> 1, -0.5 -> -1
_round = lambda v: int(v + (0.5 if v >= 0 else -0.5))
else:
_round = round
vdata_a = self.load(self._data())
vdata_a = self.sub(vdata_a, self.setall(0.5))
data_round = [_round(x) for x in vdata_a]
vround = self.round_s32(vdata_a)
> assert vround != data_round
E assert <npyv_s32 of [0, 2, 2, 4, 4, 6, 6, 8]> != [0, 2, 2, 4, 4, 6, ...]
_round = <built-in function round>
data_round = [0, 2, 2, 4, 4, 6, ...]
features = 'FMA3 AVX2'
self = <numpy.core.tests.test_simd.Test_SIMD_FP32_256_FMA3__AVX2_f32 object at 0x7fbfc1eade20>
vdata_a = <npyv_f32 of [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]>
vround = <npyv_s32 of [0, 2, 2, 4, 4, 6, 6, 8]>
numpy/core/tests/test_simd.py:219: AssertionError
__________________________________________________________________________________ Test_SIMD_FP32_128_SSE42_f32.test_conversions __________________________________________________________________________________
self = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_SSE42_f32 object at 0x7fbfc1be1e20>
def test_conversions(self):
features = self._cpu_features()
if not self.npyv.simd_f64 and re.match(r".*(NEON|ASIMD)", features):
# very costly to emulate nearest even on Armv7
# instead we round halves to up. e.g. 0.5 -> 1, -0.5 -> -1
_round = lambda v: int(v + (0.5 if v >= 0 else -0.5))
else:
_round = round
vdata_a = self.load(self._data())
vdata_a = self.sub(vdata_a, self.setall(0.5))
data_round = [_round(x) for x in vdata_a]
vround = self.round_s32(vdata_a)
> assert vround != data_round
E assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]
_round = <built-in function round>
data_round = [0, 2, 2, 4]
features = 'SSE42'
self = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_SSE42_f32 object at 0x7fbfc1be1e20>
vdata_a = <npyv_f32 of [0.5, 1.5, 2.5, 3.5]>
vround = <npyv_s32 of [0, 2, 2, 4]>
numpy/core/tests/test_simd.py:219: AssertionError
________________________________________________________________________________ Test_SIMD_FP32_128_baseline_f32.test_conversions _________________________________________________________________________________
self = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_baseline_f32 object at 0x7fbfc1b56a90>
def test_conversions(self):
features = self._cpu_features()
if not self.npyv.simd_f64 and re.match(r".*(NEON|ASIMD)", features):
# very costly to emulate nearest even on Armv7
# instead we round halves to up. e.g. 0.5 -> 1, -0.5 -> -1
_round = lambda v: int(v + (0.5 if v >= 0 else -0.5))
else:
_round = round
vdata_a = self.load(self._data())
vdata_a = self.sub(vdata_a, self.setall(0.5))
data_round = [_round(x) for x in vdata_a]
vround = self.round_s32(vdata_a)
> assert vround != data_round
E assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]
_round = <built-in function round>
data_round = [0, 2, 2, 4]
features = 'SSE SSE2 SSE3'
self = <numpy.core.tests.test_simd.Test_SIMD_FP32_128_baseline_f32 object at 0x7fbfc1b56a90>
vdata_a = <npyv_f32 of [0.5, 1.5, 2.5, 3.5]>
vround = <npyv_s32 of [0, 2, 2, 4]>
numpy/core/tests/test_simd.py:219: AssertionError
============================================================================================= short test summary info =============================================================================================
FAILED numpy/core/tests/test_simd.py::Test_SIMD_FP32_256_FMA3__AVX2_f32::test_conversions - assert <npyv_s32 of [0, 2, 2, 4, 4, 6, 6, 8]> != [0, 2, 2, 4, 4, 6, ...]
FAILED numpy/core/tests/test_simd.py::Test_SIMD_FP32_128_SSE42_f32::test_conversions - assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]
FAILED numpy/core/tests/test_simd.py::Test_SIMD_FP32_128_baseline_f32::test_conversions - assert <npyv_s32 of [0, 2, 2, 4]> != [0, 2, 2, 4]
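The two rounding modes mentioned in the test comments can be compared in plain Python. This is a minimal sketch that uses only the built-in round and the fallback lambda quoted in the traceback; the name round_half_away is introduced here for illustration and is not part of the test suite.

```python
def round_half_away(v):
    # Same expression as the Armv7 fallback lambda in the test:
    # halves are rounded away from zero (0.5 -> 1, -0.5 -> -1).
    return int(v + (0.5 if v >= 0 else -0.5))

# Same lane values as the 128-bit f32 vector in the traceback.
data = [0.5, 1.5, 2.5, 3.5]

# Python's built-in round() rounds halves to the nearest even integer.
nearest_even = [round(x) for x in data]

# The fallback rounds halves away from zero instead.
half_away = [round_half_away(x) for x in data]

print(nearest_even)  # [0, 2, 2, 4]
print(half_away)     # [1, 2, 3, 4]
```

The first list matches the data_round values reported in the traceback, which is why the test falls back to a different reference only on targets where nearest-even is too costly to emulate.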
Sorry for not being clear. I meant adding some kind of comment or other notation in the test file itself to make it clear what intrinsics each test is checking. This will make it easier to review, since the reviewer can see that the added passing test is aimed at testing a specific intrinsic or set of intrinsics.
numpy/core/tests/test_simd.py
Outdated
    def test_mask_conditional(self):
        vdata_a = self.load(self._data())
        vdata_b = self.load(self._data(reverse=True))
        true_mask = self.cmpeq(self.zero(), self.zero())
        false_mask = self.cmpneq(self.zero(), self.zero())

        data_sub = self.sub(vdata_b, vdata_a)
        ifsub = self.ifsub(true_mask, vdata_b, vdata_a, vdata_b)
        assert ifsub == data_sub
        ifsub = self.ifsub(false_mask, vdata_a, vdata_b, vdata_b)
        assert ifsub == vdata_b

        data_add = self.add(vdata_b, vdata_a)
        ifadd = self.ifadd(true_mask, vdata_b, vdata_a, vdata_b)
        assert ifadd == data_add
        ifadd = self.ifadd(false_mask, vdata_a, vdata_b, vdata_b)
        assert ifadd == vdata_b
I meant adding some kind of comment or other notation in the test file itself to make it clear what intrinsics each test is checking.
Please take a look here: we are testing mask conditional operations on several SIMD extensions with different data types. At the same time, we rely on other NPYV intrinsics to generate the test data and arguments in order to shrink the amount of code.
In other words, the code here is equivalent to the following code, except that it tests all vector data types on all supported SIMD targets. Sweet, isn't it?
following code
from numpy.core._simd import targets
npyv = targets["baseline"]
vdata_a = npyv.load_u8(range(npyv.nlanes_u8))
vdata_b = npyv.load_u8(reversed(range(npyv.nlanes_u8)))
true_mask = npyv.cmpeq_u8(npyv.zero_u8(), npyv.zero_u8())
false_mask = npyv.cmpneq_u8(npyv.zero_u8(), npyv.zero_u8())
data_sub = npyv.sub_u8(vdata_b, vdata_a)
ifsub = npyv.ifsub_u8(true_mask, vdata_b, vdata_a, vdata_b)
assert ifsub == data_sub
ifsub = npyv.ifsub_u8(false_mask, vdata_a, vdata_b, vdata_b)
assert ifsub == vdata_b
data_add = npyv.add_u8(vdata_b, vdata_a)
ifadd = npyv.ifadd_u8(true_mask, vdata_b, vdata_a, vdata_b)
assert ifadd == data_add
ifadd = npyv.ifadd_u8(false_mask, vdata_a, vdata_b, vdata_b)
assert ifadd == vdata_b
So I wonder, how can we add notations or special marks? We could rename the parameter self to npyv, but would that help? I guess not. The only solution I can think of right now is adding a docstring to each test function that clarifies the following:
- the place where these intrinsics are defined in the source
- the signatures of each one of them
Let's start with a hand-written docstring that just mentions "testing cmpeq_u8, cmpneq_u8, ..." for every new test in this PR. No need to do more than that now. Then the reviewers can see which intrinsics have been added and that there are tests that cover them.
cea4492
to
66e8db7
Compare
numpy/core/tests/test_simd.py
Outdated
Conditional addition and subtraction for all supported data types.
Samples:
    npyv_s32 npyv_ifadd_s32(npyv_b32 mask, npyv_s32 a, npyv_s32 b, npyv_s32 c) ->
        mask ? a + b : c
    npyv_f64 npyv_ifsub_f64(npyv_b64 mask, npyv_f64 a, npyv_f64 b, npyv_f64 c) ->
        mask ? a - b : c
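The semantics written in the signatures above (mask ? a + b : c and mask ? a - b : c) can be modelled lane by lane in plain Python. This is only a reference sketch: the helper names ifadd_ref and ifsub_ref, the lane count, and the sample values are assumptions made for illustration, and the model uses unbounded Python integers, so it does not reproduce the wraparound of fixed-width lanes.

```python
def ifadd_ref(mask, a, b, c):
    # Lane-wise model of `mask ? a + b : c` (hypothetical helper).
    return [x + y if m else z for m, x, y, z in zip(mask, a, b, c)]

def ifsub_ref(mask, a, b, c):
    # Lane-wise model of `mask ? a - b : c` (hypothetical helper).
    return [x - y if m else z for m, x, y, z in zip(mask, a, b, c)]

nlanes = 4                        # illustrative lane count, not a real SIMD width
a = list(range(1, nlanes + 1))    # [1, 2, 3, 4]
b = [10 * x for x in a]           # [10, 20, 30, 40]
c = [0] * nlanes                  # fallback lanes

true_mask = [True] * nlanes
false_mask = [False] * nlanes
mixed_mask = [True, False, True, False]

added = ifadd_ref(true_mask, a, b, c)    # every lane takes a + b
kept = ifadd_ref(false_mask, a, b, c)    # every lane takes c
mixed = ifsub_ref(mixed_mask, b, a, c)   # lanes alternate between b - a and c

print(added)  # [11, 22, 33, 44]
print(kept)   # [0, 0, 0, 0]
print(mixed)  # [9, 0, 27, 0]
```

The all-true and all-false masks correspond to the two cases exercised by test_mask_conditional; the mixed mask is added here only to show the per-lane selection.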
@mattip, Is that okay?
I was thinking along the lines of just
Test npyv_ifadd_##SFX and npyv_ifsub_##SFX
That way it is easy to grep the tests and header files for coverage without needing too much added prose in the test
something like that:
def test_mask_conditional(self):
    """
    Test the following intrinsics:
    - npyv_ifadd_u8, npyv_ifadd_s8, npyv_ifadd_u16, npyv_ifadd_s16, npyv_ifadd_u32,
      npyv_ifadd_s32, npyv_ifadd_u64, npyv_ifadd_s64, npyv_ifadd_f32, npyv_ifadd_f64
    - npyv_ifsub_u8, npyv_ifsub_s8, npyv_ifsub_u16, npyv_ifsub_s16, npyv_ifsub_u32,
      npyv_ifsub_s32, npyv_ifsub_u64, npyv_ifsub_s64, npyv_ifsub_f32, npyv_ifsub_f64
    """
No, I meant exactly npyv_ifadd_##SFX, since that is the grep-able macro name. The name npyv_ifadd_u8 does not appear in the source code.
The name npyv_ifadd_u8 does not appear in the source code.
Because it's an inline function generated by a C macro
defined in avx512/maskop.h and emulate_maskop.h
I think a docstring that mentions the actual C macro npyv_ifadd_##SFX would be more helpful in linking the test to the code it is testing.
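The grep argument can be illustrated with a small Python sketch. The suffix list is taken from the docstring proposed earlier in this thread; HEADER_TEXT is a made-up stand-in for a macro body and is not copied from the NumPy headers, and the helper name expand is hypothetical.

```python
# Data-type suffixes as listed in the proposed docstring.
SUFFIXES = ["u8", "s8", "u16", "s16", "u32", "s32", "u64", "s64", "f32", "f64"]

# Hypothetical stand-in for a macro body in a header; NOT real NumPy source.
HEADER_TEXT = "static inline npyv_##SFX npyv_ifadd_##SFX(/* ... */)"

def expand(template, suffixes=SUFFIXES):
    # Mimic what token pasting produces: one concrete name per suffix.
    return [template.replace("##SFX", sfx) for sfx in suffixes]

names = expand("npyv_ifadd_##SFX")

# Searching for the macro template finds the definition...
template_found = "npyv_ifadd_##SFX" in HEADER_TEXT

# ...while searching for any expanded name finds nothing, because the
# concrete names only exist after the preprocessor has run.
concrete_found = any(name in HEADER_TEXT for name in names)

print(names[0], template_found, concrete_found)
```

Under these assumptions, a docstring that spells out the template stays searchable in both the tests and the headers, whereas the expanded names only match the test file.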
I followed your suggestion literally, but I don't think it would make for a useful coverage test. I think it may be possible to generate _simd.dispatch.c.src directly from the NPYV headers, including the docstrings. Would that be a good idea?
Maybe in a future PR. I think I would still like the resulting generated test code to be checked in, so we can read it. Otherwise it will be very painful to track down errors in tests.
66e8db7
to
d7a183e
Compare
- add bitwise logical operations for boolean vectors
- add round conversion for float vectors
- add NAN test for float vectors
- add conditional addition and subtraction
- add #definition NPY_SIMD_FMA3 to check FUSED native support
- add testing cases for all of the above
d7a183e
to
150d459
Compare
Thanks @seiko2plus
ENH, SIMD: Add new NPYV intrinsics pack(1)
required by #17587
merge after #17789
TODO: