Thanks to visit codestin.com
Credit goes to github.com

Skip to content

ENH: umath: ensure ufuncs are well-defined with memory overlapping inputs #8043

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 33 commits into from
Feb 16, 2017

Conversation

pv
Copy link
Member

@pv pv commented Sep 11, 2016

Make ufuncs and ufunc operations to make temporary copies if input and output operands overlap in memory. Currently (cf. gh-6272, gh-1683) output from such operations is undefined. With this PR, the result is the same as if the operands were non-overlapping.

Fixes gh-1683

Adds new flags to NpyIter, and one new API function is added to deal with MapIter.

This PR does not fix up overlap handling in array assignment, that needs to be done later (and is slightly more hairy --- some common overlapping cases are in fact already handled). Unlike adding overlap-if-copy flag to NpyIter, it appears a similar flag would not be as useful in MapIter, as there are many special-cased code paths, and MapIter is not always made aware of all arrays operated on.

  • reread whole patch
  • re-check test coverage, and whether some of the tests are unnecessary

PyArray_ITEMSIZE(arr)))

static NPY_INLINE int
PyArray_EQUIVALENTLY_ITERABLE_OVERLAP_OK(PyArrayObject *arr1, PyArrayObject *arr2,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems rather large for inline ;)

Copy link
Member Author

@pv pv Jan 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH, I don't really want to put it in a .c file since the logic is tightly coupled to the macros here (and I suspect the multiarray/umath split adds to the mess). Removing the inline makes it emit compiler warnings. I could write this as a macro, but that's messy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ beckons ;)

@@ -4,9 +4,11 @@
import itertools

import numpy as np
from numpy.testing import run_module_suite, assert_, assert_raises, assert_equal
from numpy.testing import (run_module_suite, assert_, assert_raises, assert_equal, assert_array_equal,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line too long ?

@@ -344,6 +344,8 @@
# End 1.9 API
'PyArray_CheckAnyScalarExact': (300, NonNull(1)),
# End 1.10 API
# End 1.11 API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should update this now that 1.12 is branched. The cversion and C-API number also need updating, see documentation in doc/HOWTO_RELEASE.rst.txt.

@charris
Copy link
Member

charris commented Nov 6, 2016

Just starting to look at this.

NpyIter_Deallocate(iter);
return -1;
}
Py_DECREF(op_tmp);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I look at this again, I guess op_tmp should be placed back into op instead of being discarded.


In the memory overlap checks done when ``NPY_ITER_COPY_IF_OVERLAP``
is specified, consider this array as overlapping even if it is
exactly the same as another array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?. What other array? I think the operation here could be best explained with the help of some diagrams showing the data access and writebacks. The operation can be performed, and is valid, even if there is no overlap, just that it isn't necessary if no overlap.

Copy link
Member

@charris charris Dec 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be help clarify behavior when the indexing is many to one in either array.

operation is the same as if all operands were copied.
In cases where copies would need to be made, **the result of the
computation may be undefined without this flag!**

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only for two operands? For inplace operations, both inputs are read, but only one is written too. Or does this apply for three operands, two inputs and one output. Or is it more general than that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It applies to any number of operands.

* @returns 1 if memory overlap found, 0 if not.
*/
NPY_NO_EXPORT int
index_has_memory_overlap(PyArrayObject *self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so you are actually checking indexes rather than the resulting view?

Copy link
Member Author

@pv pv Jan 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used to checking the overlap between the array operated on, and the index array used in ufunc.at, for example

x = np.arange(100)
np.invert.at(x[::-1], x)  # <- crazy stuff if no x.copy() of the second argument

It doesn't look at the content of the indices, so it's a somewhat conservative check.

@charris
Copy link
Member

charris commented Dec 6, 2016

Just a first read through. This will need a release note eventually.

@charris
Copy link
Member

charris commented Dec 6, 2016

Probably should bring in @seberg on this also.

Test ufunc call memory overlap handling
"""

def check_unary_fuzz(self, operation, get_out_axis_size, dtype=np.int16,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would need to check again if these tests cover all places where the copyifoverlap behavior is in, and/or whether some of the tests are unnecessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, checked --- removing any of NPY_COPY_IF_OVERLAP et al. causes some test to fail

@@ -3635,6 +3654,7 @@ PyUFunc_Reduceat(PyUFuncObject *ufunc, PyArrayObject *arr, PyArrayObject *ind,
/* In case COPY or UPDATEIFCOPY occurred */
op[0] = NpyIter_GetOperandArray(iter)[0];
op[1] = NpyIter_GetOperandArray(iter)[1];
op[2] = NpyIter_GetOperandArray(iter)[2];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this line added? a bug in the old code or due to the other changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a noop. Not necessary to add.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I think it's useful to have here to ensure correctness, if we later on change NPY_ITER_COPY_IF_OVERLAP to make copies of also the input arrays.

pv added 11 commits January 19, 2017 22:12
Add a new NPY_ITER_COPY_IF_OVERLAP iterator flag to NpyIter, which
instructs it to check if read operands overlap with write operands in
memory, and make temporary copies to eliminate detected overlap.

Thanks to Sebastian Berg.
These can differ if the iterator decides to make temporary copies of
output arrays.
…ABLE loops

These loops iterate over whole arrays in "trivial" order, so that it is
possible to reason about the data dependency.
After this commit, all code paths in PyUFunc_GeneralizedFunction and
PyUFunc_GenericFunction perform input/output operand memory overlap
detection, and make copies to eliminate it.
…c_loops

Because the wheremask loop does not write to all elements of the array, it
needs to consider output arrays as READWRITE, in case COPY_IF_OVERLAP makes
temporary copies.

As the iterator may make temporary copies of arrays, make sure to pass those to
__array_prepare__ instead of the originals.
This adds a new method PyArray_MapIterArrayCopyIfOverlap to the API.
@njsmith
Copy link
Member

njsmith commented Jan 23, 2017

Oh, right, now I understand! ...this rings a bell, too. @seberg, weren't you telling me at some point about how there's some code inside numpy that relies on implementing reducations via zero-strided calls to the non-reduce loops? Would that be affected by this? (My guess is that whatever code is using this trick is "below" the overlap checking in the stack so unaffected by it, and I'm totally happy with this not being something we guarantee to end-users. Also hopefully no-one is writing horrible as_strided expressions as a "shorthand" instead of calling np.sum!)

@pv
Copy link
Member Author

pv commented Jan 23, 2017 via email

@charris
Copy link
Member

charris commented Jan 29, 2017

@pv Do you feel that this is about ready?

@pv
Copy link
Member Author

pv commented Feb 6, 2017

@charris: I've read this several times through now, and only finding forgotten decref on error handling. I'm fairly confident now that it's what it says on the tin, and doesn't do anything more.

It's a bit ugly that this adds PyArray_MapIterArrayCopyIfOverlap to the API --- it's essentially a private function, as was PyArray_MapIterArray --- but the better way is not obvious. On the other hand, adding the new iterator flags appears to be a good way to do this, as it makes reasoning about the change easier, and the changes in ufuncs themselves are mostly minimal. Currently, the copying is somewhat conservative in that perhaps some could be avoided (e.g. in accumulate) and the heuristics in deciding what to copy is a bit primitive, but that's something that in principle can be tightened up later.

@pv
Copy link
Member Author

pv commented Feb 6, 2017

One additional thing is that this probably won't work on PyPy due to the use of updateifcopy, but I guess that's part of a bigger pypy issue that needs separate fixes (probably sprinkling PyArray_FlushUpdateIfCopy around) in more places than just this.

@pv pv mentioned this pull request Feb 8, 2017
5 tasks
@charris
Copy link
Member

charris commented Feb 8, 2017

@mattip @rlamy Comments?

@rlamy
Copy link
Contributor

rlamy commented Feb 9, 2017

Well, I don't really understand the UPDATEIFCOPY issue, but since @mattip is on holiday, I'll comment anyway.

All the new tests appear to pass on both pypy and pypy3. I'm guessing that's because the copy cannot be seen by the RPython GC. So I think this is OK for us.

@pv
Copy link
Member Author

pv commented Feb 9, 2017 via email

if (index_type & (HAS_FANCY | HAS_BOOL)) {
for (i = 0; i < num; ++i) {
if (indices[i].object != NULL && PyArray_Check(indices[i].object) &&
solve_may_share_memory(self, (PyArrayObject *)indices[i].object,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have extra indent here to separate the condition from the block.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit tricky with the long line, but it doesn't read well as is.

}

if (extra_op != NULL && PyArray_Check(extra_op) &&
solve_may_share_memory(self, (PyArrayObject *)extra_op, 1) != 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

@@ -3197,6 +3259,21 @@ PyArray_MapIterArray(PyArrayObject * a, PyObject * index)
}


/*NUMPY_API
*
* Use advanced indexing to iterate an array. Please note
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be documented somewhere public. It seems dangerous to expose an API that may change...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could someone detect the change to this particular function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This the is old comment for PyArray_MapIterArray, not added by me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, removing it, since it's unlikely we will change this...

@@ -2708,6 +2710,80 @@ npyiter_allocate_arrays(NpyIter *iter,
bufferdata = NIT_BUFFERDATA(iter);
}

if (flags & NPY_ITER_COPY_IF_OVERLAP) {
/* Perform operand memory overlap checks if requested */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add more detail on what is being checked here, the notes below are a bit too local.

* make sense to go further.
*/
may_share_memory = solve_may_share_memory(
op[iop], op[iother], 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe extra indent here, or even break the args and align.

return 1;
}

/* Arrays overlapping in memory may be equivalently iterable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiline comment not correctly formatted ;) Also below.

}
else if (stride1 < 0) {
arr1_ahead = (stride1 <= stride2 &&
(npy_uintp)PyArray_BYTES(arr1) <= (npy_uintp)PyArray_BYTES(arr2));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Out of curiousity, did the compiler complain without the cast?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The casts are not needed (probably leftovers from earlier versions).

@@ -1730,12 +1764,17 @@ execute_fancy_ufunc_loop(PyUFuncObject *ufunc,
}
}
for (i = nin; i < nop; ++i) {
/*
* Because we don't write to all elements, the output arrays
* must be considered READWRITE by the iterator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updateifcopy copies all elements back, not only those for which wheremask is true, so the old content of the output array must be copied there.

@charris
Copy link
Member

charris commented Feb 10, 2017

Not done reviewing yet. Determining when a copy isn't needed for overlapping arrays seems tricky except in single operand the linear case. ISTR that the iterator may coalesce axes.

pv added 2 commits February 10, 2017 20:49
It's unlikely we will break backward compatibility, so better
not claim we will do it in the code.
@charris charris merged commit d55f40b into numpy:master Feb 16, 2017
@charris
Copy link
Member

charris commented Feb 16, 2017

I'm going to put this in so it will start getting more testing. I don't claim to have understood all the details, but I doubt I will discover any errors by inspection. The write up in the release notes is very nice, you should consider copying that over to the official documentation. There are still some style nits that I may attempt to clean up later, I would gently urge you to pay more attention to spaces around operators ('+') and line length, even though the latter may not always be as readable.

Thanks for this work, Pauli.

@seberg
Copy link
Member

seberg commented Feb 17, 2017

Wow, cool. Assuming nothing goes haywire, this is the next huge addition :)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Surprising results from in-place operations involving views (Trac #1085)
7 participants