BUG: fix uint alignment asserts in lowlevel loops #12626

ahaldane · 2018-12-28T03:16:55Z

Further correction to the debug assert statements in lowlevel_strided_loops.c.src to account for uint alignment, see #12618.

This also updates the unit test so it always fails if the alignment is incorrectly calculated, instead of failing sporadically depending on what malloc returns. This is done by making _aligned_zeros return an array that is aligned to the requested alignment but never to twice that alignment.
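The "aligned to N but not to 2N" idea can be sketched with plain address arithmetic. This is only an illustration of the offset calculation, not the code of _aligned_zeros itself; `pick_offset` is a hypothetical helper name and addresses are modelled as integers.

```python
def pick_offset(base_addr, align):
    """Smallest offset >= 0 such that base_addr + offset is a multiple of
    `align` but NOT a multiple of `2 * align`."""
    if align <= 0 or align & (align - 1):
        raise ValueError("align must be a positive power of two")
    # Distance from base_addr up to the next multiple of `align`.
    offset = (-base_addr) % align
    # If that address is also a multiple of 2*align, move one `align` further.
    if (base_addr + offset) % (2 * align) == 0:
        offset += align
    return offset


# Whatever address the allocator hands back, the adjusted address has
# exactly the requested alignment and no more.
for base in (0, 4, 8, 12, 1000, 1001):
    addr = base + pick_offset(base, 8)
    assert addr % 8 == 0 and addr % 16 != 0
```

The offset is at most 2 * align - 1, so over-allocating the buffer by that many bytes is enough to make the result independent of what malloc returns.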

The particular case that was failing was for 16-byte longdouble, which is 8-byte "uint aligned" but 16-byte "true aligned". (The copy code copies 16-byte types with two uint64 assignments.) So an 8-byte-aligned pointer would go into the uint-aligned copy code, but would then trip the 16-byte assert statement.
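A minimal sketch of that mismatch, written in Python for brevity. The alignment table assumes x86-64 (on other platforms the numbers differ), and `uint_alignment` and `is_aligned` are hypothetical stand-ins rather than NumPy's actual C helpers.

```python
# Assumed x86-64 values: alignment of the unsigned integer type used to copy
# an item of the given size. A 16-byte item is copied as two uint64 values.
UINT_ALIGN_BY_ITEMSIZE = {1: 1, 2: 2, 4: 4, 8: 8, 16: 8}


def uint_alignment(itemsize):
    """Alignment needed by the uint copy loop, or 0 if no such loop exists."""
    return UINT_ALIGN_BY_ITEMSIZE.get(itemsize, 0)


def is_aligned(addr, alignment):
    """An alignment of 0 means 'never aligned' in this sketch."""
    return alignment > 0 and addr % alignment == 0


LONGDOUBLE_ITEMSIZE = 16    # assumed x86-64 longdouble size
LONGDOUBLE_TRUE_ALIGN = 16  # assumed x86-64 longdouble alignment

ptr = 0x1008  # 8-byte aligned, but not 16-byte aligned

# The pointer qualifies for the uint-aligned copy path ...
takes_uint_copy_path = is_aligned(ptr, uint_alignment(LONGDOUBLE_ITEMSIZE))
# ... but an assert that checks true alignment rejects it.
passes_true_align_assert = is_aligned(ptr, LONGDOUBLE_TRUE_ALIGN)
```

Here `takes_uint_copy_path` is true while `passes_true_align_assert` is false, which is the combination that tripped the old assert.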

ahaldane · 2018-12-28T06:06:17Z

Hmm the updated test caught some bugs...

charris · 2018-12-28T13:43:29Z

> Hmm the updated test caught some bugs...

Good test :)

charris · 2018-12-28T16:25:52Z

At some point we should experiment with not copying using ints. I think the original need was to work around gcc's memcpy/memmove etc., which used to be performance disasters and were supposedly fixed some years ago. With SIMD et al., it may be that compilers can do a better job than we can these days.

mattip · 2018-12-31T13:55:11Z

The 32-bit failing test hits the assert on line 812 in _aligned_cast_bool_to_longdouble, when running numpy/core/tests/test_einsum.py::TestEinsum::test_einsum_sums_longdouble

Here is the stack trace. It is from a non-debug Python build, so I do not get the nice Python stack arguments.

#3  0xf7e1bd07 in __assert_fail_base (fmt=0xf7f56258 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0xf7966618 "N == 0 || npy_is_aligned(dst, _UINT_ALIGN(_TYPE2))", 
    file=0xf796645c "numpy/core/src/multiarray/lowlevel_strided_loops.c.src", line=812, 
    function=0xf79675e0 <__PRETTY_FUNCTION__.17102> "_aligned_cast_bool_to_longdouble") at assert.c:92
#4  0xf7e1bd8b in __GI___assert_fail (assertion=0xf7966618 "N == 0 || npy_is_aligned(dst, _UINT_ALIGN(_TYPE2))", 
    file=0xf796645c "numpy/core/src/multiarray/lowlevel_strided_loops.c.src", line=812, 
    function=0xf79675e0 <__PRETTY_FUNCTION__.17102> "_aligned_cast_bool_to_longdouble") at assert.c:101
#5  0xf77ef5e9 in _aligned_cast_bool_to_longdouble (dst=0xffff7938 "\354\201\377\377ln\365\367\002", dst_stride=0, 
    src=0xffff7a2b "", src_stride=0, N=1, __NPY_UNUSED_TAGGEDsrc_itemsize=1, __NPY_UNUSED_TAGGEDdata=0x0)
    at numpy/core/src/multiarray/lowlevel_strided_loops.c.src:812
#6  0xf77aac6c in PyArray_CastRawArrays (count=1, src=0xffff7a2b "", dst=0xffff7938 "\354\201\377\377ln\365\367\002", src_stride=0, 
    dst_stride=0, src_dtype=0xf7a2cd40 <BOOL_Descr>, dst_dtype=0xf7a2c9c0 <LONGDOUBLE_Descr>, move_references=0)
    at numpy/core/src/multiarray/dtype_transfer.c:3785
#7  0xf7777cc4 in PyArray_AssignRawScalar (dst=0xf4ab1ac0, src_dtype=0xf7a2cd40 <BOOL_Descr>, src_data=0xffff7a2b "", wheremask=0x0, 
    casting=NPY_SAFE_CASTING) at numpy/core/src/multiarray/array_assign_scalar.c:248
#8  0xf7784781 in PyArray_AssignZero (dst=0xf4ab1ac0, wheremask=0x0) at numpy/core/src/multiarray/convert.c:539
#9  0xf77be7ea in PyArray_EinsteinSum (subscripts=<optimized out>, nop=1, op_in=0xffff929c, dtype=0x0, order=NPY_KEEPORDER, 
    casting=NPY_SAFE_CASTING, out=0x0) at numpy/core/src/multiarray/einsum.c.src:2772
#10 0xf7813b3c in array_einsum (__NPY_UNUSED_TAGGEDdummy=0xf7a76e64, args=0xf4eb7fac, kwds=0xf4ac810c)
    at numpy/core/src/multiarray/multiarraymodule.c:2663

mattip · 2018-12-31T20:15:09Z

numpy/core/src/multiarray/lowlevel_strided_loops.c.src

-    assert(N == 0 || npy_is_aligned(dst, _ALIGN(_TYPE2)));
-#  endif
+    assert(N == 0 || npy_is_aligned(src, _UINT_ALIGN(_TYPE1)));
+    assert(N == 0 || npy_is_aligned(dst, _UINT_ALIGN(_TYPE2)));


This now fails on npy_longdouble on 32 bit, where sizeof(npy_longdouble) == 12.
Shouldn't this be _UINT_ALIGN(dtype->alignment), not _UINT_ALIGN(dtype->elsize)?

As written, this can never succeed when sizeof(_TYPE2) == 12

Edit: add qualifier sizeof

charris · 2018-12-31T21:41:03Z

@ahaldane We have continuing failures on both master and 1.16.x due to merging the earlier fix. It would be good to get this finished.

charris · 2019-01-01T21:42:50Z

See also #12638.

ahaldane · 2019-01-02T16:51:58Z

I'll get to it tonight. I have the fix mostly done from a few days ago and just need to check it over.

ahaldane · 2019-01-03T05:45:38Z

All right, should be ready for review now.

I now realize that any array that wants to use the "aligned" code paths in the lowlevel loops must be both "uint" and "true" aligned. This is because the casting paths need true alignment and the copy paths need uint alignment, but the dtype-dispatch functions (PyArray_GetDtypeTransferFunction) don't yet know which path we will go down when we supply the aligned flag.

So the solution here is to compute the alignment flag using both types of alignment in those cases. I also made the assert statements in the lowlevel loops check for the appropriate kind of alignment, depending on what the few lines after each assert do.
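As a rough model of the combined flag, the sketch below treats an array as a base address plus a tuple of strides. It is a simplification for illustration only: `array_is_aligned` and `aligned_flag` are hypothetical names, and the alignment numbers in the checks are example values, not queried from a real dtype.

```python
def array_is_aligned(data_addr, strides, alignment):
    """True if the base address and every stride are multiples of
    `alignment`. An alignment of 0 is treated as 'never aligned'."""
    if alignment <= 0:
        return False
    return data_addr % alignment == 0 and all(s % alignment == 0 for s in strides)


def aligned_flag(data_addr, strides, uint_align, true_align):
    """The 'aligned' fast paths are only safe if BOTH alignments hold,
    because the caller does not know yet whether a copy loop (needs uint
    alignment) or a cast loop (needs true alignment) will be selected."""
    return (array_is_aligned(data_addr, strides, uint_align)
            and array_is_aligned(data_addr, strides, true_align))


# uint alignment 8, true alignment 16: an 8-aligned pointer is not enough.
assert not aligned_flag(0x1008, (16,), uint_align=8, true_align=16)
assert aligned_flag(0x1010, (16,), uint_align=8, true_align=16)

# uint alignment 8, true alignment 4: a 4-aligned pointer is not enough either.
assert not aligned_flag(0x1004, (8,), uint_align=8, true_align=4)
```

Either kind of alignment can be the stricter one, which is why the flag cannot be derived from just one of them.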

I grepped all the source, so I think I got all the places that need to check both. There are a few places that only need to check one or the other, e.g. the mapiter_* functions only need uint alignment, so I left some IsUintAligned checks alone (not visible in the diff).

On my system, it passes tests with USE_DEBUG both on x64 and in an x86 chroot. The updated test here should be more thorough and deterministic than before, too. I also documented some cases where uint alignment is larger than true alignment and vice versa, which are useful for sanity checking.

mattip · 2019-01-03T06:02:36Z

numpy/core/tests/test_multiarray.py

+
+    The ndarray is guranteed *not* aligned to twice the requested alignment.
+    Eg, if align=4, guarantees it is not aligned to 8. If align=None uses
+    dtype.alignment."""


"guranteed" spelling.

mattip · 2019-01-03T06:04:03Z

LGTM. Newly allocated ndarrays will always be both uint aligned and true aligned, correct?

ahaldane · 2019-01-03T06:07:14Z

Only true alignment is guaranteed, as uint alignment depends on the itemsize. Arrays whose itemsize is not 2, 4, 8, or 16 count as "not uint aligned", and will generally go down the unaligned code paths.

charris · 2019-01-03T15:01:37Z

numpy/core/src/multiarray/nditer_constr.c

-            /* New arrays are aligned and need no cast */
-            op_itflags[iop] |= NPY_OP_ITFLAG_ALIGNED;
+            /*
+             * New arrays are guranteed true-aligned, but copy/cast code


"guranteed" spelling.

charris · 2019-01-03T15:02:20Z

numpy/core/src/multiarray/nditer_constr.c

@@ -2888,11 +2894,17 @@ npyiter_allocate_arrays(NpyIter *iter,
                    PyArray_DATA(op[iop]), NULL);

            /*
-             * New arrays are aligned need no cast, and in the case
+             * New arrays are guranteed true-aligned, but copy/cast code


"guranteed" spelling.

charris · 2019-01-03T15:02:45Z

numpy/core/src/multiarray/nditer_constr.c

-            /* The temporary copy is aligned and needs no cast */
-            op_itflags[iop] |= NPY_OP_ITFLAG_ALIGNED;
+            /*
+             * New arrays are guranteed true-aligned, but copy/cast code


"guranteed" spelling.

charris · 2019-01-03T15:03:08Z

numpy/core/tests/test_multiarray.py

+    """
+    Allocate a new ndarray with aligned memory.
+
+    The ndarray is guranteed *not* aligned to twice the requested alignment.


"guranteed" spelling.

ahaldane · 2019-01-03T17:09:12Z

Typos fixed.

charris · 2019-01-03T19:19:21Z

Great. Thanks Allan.

mhvk · 2019-01-06T16:02:25Z

doc/source/reference/alignment.rst

-alignments.
+Note that the strided-copy and strided-cast code are deeply intertwined and so
+any arrays being processed by them must be both uint and true aligned, even
+though te copy-code only needs uint alignment and the cast code only true


@ahaldane - in #12677 could you fix the typo here (te -> the)

mhvk · 2019-01-06T16:05:26Z

numpy/core/src/multiarray/array_assign_array.c

    aligned = raw_array_is_aligned(ndim, shape, dst_data, dst_strides,
                                   npy_uint_alignment(dst_dtype->elsize)) &&
+              raw_array_is_aligned(ndim, shape, dst_data, dst_strides,


This is a belated comment, but this does seem rather inefficient: one should check the larger alignment first, and then the smaller one only if it isn't an integer factor of the larger.

Given how often this recurs, probably should have a routine that checks both...
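One possible shape for such a combined routine, sketched in Python on a single address. The name `is_aligned_to_both` is hypothetical; this is not what the PR or its follow-ups implement, only an illustration of the "check the larger one first" idea.

```python
def is_aligned_to_both(addr, align_a, align_b):
    """Check one address against two alignments with as few tests as possible."""
    big, small = max(align_a, align_b), min(align_a, align_b)
    if small <= 0:
        # An alignment of 0 is treated as 'never aligned' in this sketch.
        return False
    if addr % big != 0:
        return False
    # If the smaller alignment divides the larger one, being aligned to the
    # larger already implies being aligned to the smaller.
    if big % small == 0:
        return True
    return addr % small == 0


assert is_aligned_to_both(0x1010, 8, 16)
assert not is_aligned_to_both(0x1008, 8, 16)
```

When both alignments are powers of two, the smaller always divides the larger, so a single test against the larger value is enough.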

ahaldane force-pushed the further_uint_align_fix branch from f9edc06 to 8cc9d9b Compare December 28, 2018 03:34

ahaldane added the 00 - Bug, component: numpy._core, and 09 - Backport-Candidate labels Dec 28, 2018

ahaldane added this to the 1.16.0 release milestone Dec 28, 2018

seberg mentioned this pull request Dec 31, 2018

BUG: Fix incorrect/missing reference cleanups found using valgrind #12624

Merged

9 tasks

mattip reviewed Dec 31, 2018

mattip mentioned this pull request Jan 1, 2019

BUG: disable flaky sanity checks #12637

Closed

charris mentioned this pull request Jan 1, 2019

BUG: several test errors on SPARC #12638

Open

4 tasks

ahaldane force-pushed the further_uint_align_fix branch 3 times, most recently from 848fa04 to 791c5f4 Compare January 3, 2019 05:25

mattip reviewed Jan 3, 2019

charris reviewed Jan 3, 2019

BUG: fix uint alignment asserts in lowlevel loops

812e359

ahaldane force-pushed the further_uint_align_fix branch from 791c5f4 to 812e359 Compare January 3, 2019 16:27

charris merged commit fd89a41 into numpy:master Jan 3, 2019

charris removed the 09 - Backport-Candidate label Jan 3, 2019

charris removed this from the 1.16.0 release milestone Jan 3, 2019

charris mentioned this pull request Jan 3, 2019

BUG: fix uint alignment asserts in lowlevel loops #12655

Merged

This was referenced Jan 5, 2019

TST: Fix endianness in unstuctured_to_structured test #12671

Merged

MAINT: Further fixups to uint alignment checks #12677

Merged

mhvk reviewed Jan 6, 2019

charris mentioned this pull request Jan 9, 2019

MAINT: Further fixups to uint alignment checks #12706

Merged

ahaldane mentioned this pull request Apr 22, 2019

Bug in function add.at (core dump) #13317

Open
