False positives for warning about writing to broadcast array in cython code #13929

Closed
mhvk opened this issue Jul 7, 2019 · 23 comments · Fixed by #14030

@mhvk
Contributor

mhvk commented Jul 7, 2019

In astropy, we're getting loads of DeprecationWarnings about possibly writing to a broadcast array from cythonized code. They all seem to be false positives, and may instead reflect something in Cython that I do not understand. The problem can be reproduced with the trivial code below. For our particular case, I think this could be solved by ensuring the warning flag is only set if there is actual memory overlap inside the broadcast array.

Sample file broadcast.pyx

import warnings
warnings.filterwarnings('error')

import numpy as np
cimport numpy as np


t, _ = np.broadcast_arrays(np.ones(10), np.array(1.))

# Real code also has the following to ensure that broadcast arrays are copied.
# But that does not make a difference for the error, since the strides of `t` are fine.
# t = np.asarray(t, order='C')

ctypedef np.float64_t DTYPE_t


cdef check(DTYPE_t[::1] t):
    pass


print(t.strides)
check(t)

Corresponding setup.py:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("broadcast.pyx")
)

Compile and run with:

python3 setup.py build_ext --inplace
python3 -c 'import broadcast'
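
The suggestion of only setting the warning flag when there is actual memory overlap can be illustrated in plain NumPy. This is a minimal sketch, not NumPy's implementation: `has_zero_stride_overlap` is a hypothetical helper, and it only detects the case of a zero stride along an axis longer than one, which is the kind of overlap broadcasting produces.

```python
import numpy as np


def has_zero_stride_overlap(arr):
    """Hypothetical helper: True if some axis of length > 1 has stride 0.

    A zero stride along such an axis means several indices map to the same
    memory. Other kinds of overlap (for example from as_strided) are not
    detected by this simple check.
    """
    return any(n > 1 and s == 0 for n, s in zip(arr.shape, arr.strides))


t, s = np.broadcast_arrays(np.ones(10), np.array(1.))

print(t.strides, has_zero_stride_overlap(t))  # input already had the final shape
print(s.strides, has_zero_stride_overlap(s))  # scalar stretched to length 10
```

Under this check only `s`, the broadcast scalar, would count as having overlapping elements; `t` keeps ordinary strides.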
@charris
Member

charris commented Jul 7, 2019

What cython version?

@mhvk
Contributor Author

mhvk commented Jul 7, 2019

For the astropy tests, 0.29.10; for my own tests, 0.29.2.

I checked, and the code above still gives the deprecation warning with 0.29.12 (the newest available with pip).

@charris
Member

charris commented Jul 7, 2019

Hmm, hard to track, but it looks like cython is setting the buffer to writable.

@charris
Member

charris commented Jul 7, 2019

Or maybe just checking for writable.

__pyx_t_6 = __Pyx_PyObject_to_MemoryviewSlice_dc_nn___pyx_t_9broadcast_DTYPE_t(__pyx_t_2, PyBUF_WRITABLE); if (unlikely(!__pyx_t_6.memview)) __PYX_ERR(2, 22, __pyx_L1_error)

@charris
Member

charris commented Jul 7, 2019

So maybe a Cython bug, but probably not one we can track down for 1.17.0.

@charris
Member

charris commented Jul 7, 2019

Cython is requesting a writable buffer, and in this case I think NumPy should provide same. However, I wonder if cython handles read only arrays correctly?

@charris
Member

charris commented Jul 7, 2019

This fixes the example program

t, _ = np.broadcast_arrays(np.ones(10), np.array(1.))
t.flags.writeable = 1

@mhvk
Contributor Author

mhvk commented Jul 7, 2019

Indeed, when I explicitly set t.flags.writeable = False, I get

ValueError: buffer source array is read-only

so, you're right, cython expects its memoryview to be writeable.

Apparently, one should be able to avoid that in more recent cython (0.28) by explicitly declaring them const (https://github.com/cython/cython/pull/1869/files).

And I can make this work, but only if I explicitly set my array as non-writeable (see below). Apparently, unless the array is explicitly set as non-writeable, cython will check that it is writeable even if one is taking a read-only view. Which is, of course, a bit odd.

From the numpy view, I'm not sure what would be best to work-around this...

import warnings
warnings.filterwarnings('error')

import numpy as np
cimport numpy as np


t, _ = np.broadcast_arrays(np.ones(10), np.array(1.))

# Real code also has the following to ensure that broadcast arrays are copied.
# But that does not make a difference for the error, since the strides of `t` are fine.
# t = np.asarray(t, order='C')

ctypedef np.float64_t DTYPE_t

cdef const DTYPE_t[::1] tc
t.setflags(write=False)


cdef check(const DTYPE_t[::1] t):
    pass


print(t.strides)
check(t)
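
For code that should not change the flags of an array it was handed, one alternative to calling `t.setflags(write=False)` in place is to clear the flag on a view. The following is only a sketch of that idea in plain NumPy; `readonly_view` is a hypothetical helper, not part of NumPy or Cython.

```python
import numpy as np


def readonly_view(arr):
    """Hypothetical helper: return a non-writeable view sharing arr's data."""
    view = arr.view()
    view.flags.writeable = False  # affects only the view, not `arr`
    return view


base = np.ones(10)
ro = readonly_view(base)

print(ro.flags.writeable, base.flags.writeable)
print(memoryview(ro).readonly)

try:
    ro[0] = 2.0
except ValueError as exc:
    print("write rejected:", exc)
```

The original array stays writeable, while consumers that receive `ro` see a read-only buffer.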

@charris
Member

charris commented Jul 7, 2019

More

In [2]: t, _ = np.broadcast_arrays(np.ones(10), np.array(1.))                   

In [3]: t.flags                                                                 
Out[3]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True  (with WARN_ON_WRITE=True)
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [4]: t.flags.writeable = 1                                                   

In [5]: t.flags                                                                 
Out[5]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

@charris
Member

charris commented Jul 7, 2019

Making a copy also fixes the problem as claimed.

In [5]: t, _ = np.broadcast_arrays(np.ones(10), np.array(1.))                   

In [6]: t.copy().flags                                                          
Out[6]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
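
As a side note on why a copy helps while the commented-out `np.asarray(t, order='C')` in the sample file does not: `np.asarray` hands back its input unchanged when the layout already matches, so any flags on it are kept. The sketch below uses an ordinary slice view as a stand-in for the broadcast result, so that it does not depend on how a particular NumPy version flags broadcast arrays.

```python
import numpy as np

base = np.ones(10)
view = base[:]          # a contiguous view that does not own its data

same = np.asarray(view, order='C')   # already C-contiguous: no copy is made
fresh = view.copy()                  # always allocates new memory

print(same is view, same.flags.owndata)
print(np.shares_memory(fresh, base), fresh.flags.owndata)
```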

@mhvk
Contributor Author

mhvk commented Jul 7, 2019

@charris - your approach is simpler, but sort of defeats the point: I'm getting a warning that I may be writing, which is indeed dangerous, so I should not just make the array writeable!

I think the problem occurs because cython checks whether the array is writeable even if that property is explicitly not needed. But of course it is not unreasonable to check! And since we emit the warning basically on accessing that WRITEABLE flag, one has a problem... Ideally, of course, we would warn on actual writing...

@mhvk
Contributor Author

mhvk commented Jul 7, 2019

Actually, a simpler "solution" may be to add to the warning text that explicitly setting WRITEABLE to False can also be useful, if the subsequent code is simply checking whether an array is writeable, but not necessarily doing anything.

@charris
Member

charris commented Jul 7, 2019

@mhvk Looking at astropy/astropy#8965, I wonder if a better route might be to just skip the deprecation warning and go straight to making the array readonly.

CC: @mattip @eric-wieser

@seberg
Member

seberg commented Jul 7, 2019

Hmm, we could remove the warning from the flags, but then the deprecation warning is not too helpful, but I agree. If cython checks the flags upon conversion (and it must), then the whole deprecation may be tricky. If you add the const in cython, cython in principle would not have to check the flags though? Is there any chance we can make that work (and maybe delay the deprecation for one release?).

@charris
Member

charris commented Jul 7, 2019

@seberg The problem seems to be that we are using a flag that cython doesn't recognize. Cython still has to check the flags for other things (I think), so my sense is that we should skip the deprecation step. That will make the fix easier. Sometimes a clean cut is the least painful operation.

@seberg
Member

seberg commented Jul 7, 2019

The actual problem here is our own code I think:

    /*
     * If a read-only buffer is requested on a read-write array, we return a
     * read-write buffer, which is dubious behavior. But that's why this call
     * is guarded by PyArray_ISWRITEABLE rather than (flags &
     * PyBUF_WRITEABLE).
     */
    if (PyArray_ISWRITEABLE(self)) {
        if (array_might_be_written(self) < 0) {
            goto fail;
        }
    }

If we change that "dubious" behaviour, the warning can be silenced using the const attribute in cython (which seems more or less correct).
The second thing is whether t in this example should not have a warning, since its shape did not change (I somewhat thought this was the case, although I am not sure I mind that it has a warning).

@seberg
Member

seberg commented Jul 7, 2019

Ah, the buffer protocol cannot indicate readonly specifically (just request writeable). So I suppose we could simply return a readonly buffer right away when the user does not ask for writable specifically. That jumps the deprecation a little, but is likely fine. The only thing I find a bit annoying about it, is that without the deprecation, it might be a bit hard to see why an array is readonly, but I suppose that cannot be helped.
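
The asymmetry described here (a consumer can insist on a writable buffer, but otherwise just takes whatever the exporter offers) can be seen with standard-library objects alone. This is an illustration of the buffer protocol in general, with `bytes` and `bytearray` standing in for read-only and writable arrays; it does not involve NumPy's exporter.

```python
import struct

ro_source = bytes(8)          # exporter that only offers read-only buffers
rw_source = bytearray(8)      # exporter that offers writable buffers

# memoryview() does not insist on writability; it reports what it was given.
print(memoryview(ro_source).readonly)
print(memoryview(rw_source).readonly)

# struct.pack_into() has to write, so it requires a writable buffer.
struct.pack_into('d', rw_source, 0, 1.5)
print(struct.unpack_from('d', rw_source, 0)[0])

try:
    struct.pack_into('d', ro_source, 0, 1.5)
except TypeError as exc:
    print("writable request refused:", exc)
```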

@seberg
Member

seberg commented Jul 7, 2019

So yeah, jumping the deprecation might be the best way and we can do that specific to the buffer protocol:

Patch to jump deprecation for the buffer protocol:
diff --git a/numpy/core/src/multiarray/buffer.c b/numpy/core/src/multiarray/buffer.c
index d8ad80266..4041b055f 100644
--- a/numpy/core/src/multiarray/buffer.c
+++ b/numpy/core/src/multiarray/buffer.c
@@ -771,17 +771,6 @@ array_getbuffer(PyObject *obj, Py_buffer *view, int flags)
             goto fail;
         }
     }
-    /*
-     * If a read-only buffer is requested on a read-write array, we return a
-     * read-write buffer, which is dubious behavior. But that's why this call
-     * is guarded by PyArray_ISWRITEABLE rather than (flags &
-     * PyBUF_WRITEABLE).
-     */
-    if (PyArray_ISWRITEABLE(self)) {
-        if (array_might_be_written(self) < 0) {
-            goto fail;
-        }
-    }
 
     if (view == NULL) {
         PyErr_SetString(PyExc_ValueError, "NULL view in getbuffer");
@@ -797,7 +786,14 @@ array_getbuffer(PyObject *obj, Py_buffer *view, int flags)
     view->buf = PyArray_DATA(self);
     view->suboffsets = NULL;
     view->itemsize = PyArray_ITEMSIZE(self);
-    view->readonly = !PyArray_ISWRITEABLE(self);
+    /*
+     * Set a requested buffer to readonly also if the array will be readonly
+     * after a deprecation. This jumps the deprecation, but avoiding the
+     * warning is not convenient here and a warning is given if a writeable
+     * buffer is requested.
+     */
+    view->readonly = (!PyArray_ISWRITEABLE(self) ||
+                      PyArray_CHKFLAGS(self, NPY_ARRAY_WARN_ON_WRITE));
     view->internal = NULL;
     view->len = PyArray_NBYTES(self);
     if ((flags & PyBUF_FORMAT) == PyBUF_FORMAT) {

I am not sure whether that diff will force you to use the const specifiers, or whether cython doesn't request writeable unless it knows the buffer will be written to.
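
To make the effect of the diff easier to follow, here is a toy model of the decision it encodes, written in Python. It is a sketch based only on the diff and its comment, not NumPy code: `model_getbuffer` and `BufferResult` are hypothetical names, and the three booleans stand for the array's WRITEABLE flag, its warn-on-write flag, and whether the consumer passed PyBUF_WRITABLE.

```python
from typing import NamedTuple


class BufferResult(NamedTuple):
    readonly: bool   # value the exporter would store in view->readonly
    warns: bool      # whether the deprecation warning would be emitted


def model_getbuffer(writeable, warn_on_write, request_writable):
    """Toy model of the patched logic; not NumPy code."""
    if request_writable:
        if not writeable:
            raise ValueError("buffer source array is read-only")
        # A writable buffer was explicitly requested: hand it out, warning
        # if the array carries the warn-on-write flag.
        return BufferResult(readonly=False, warns=warn_on_write)
    # No writable request: report read-only if the array is read-only now
    # or will become read-only once the deprecation expires.
    return BufferResult(readonly=(not writeable) or warn_on_write, warns=False)


for writeable in (True, False):
    for warn in (True, False):
        print(writeable, warn, model_getbuffer(writeable, warn, False))
```

In this model a consumer that never asks for a writable buffer never triggers the warning, which is the behaviour the patch is aiming for.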

@eric-wieser
Member

It's possible we should just roll back the warning on accessing the writeable attribute; it doesn't provide a huge amount of value, and it sounds like it comes at a high cost with cython.

@seberg: IMO forcing cython to use const for read-only arrays would be a really good idea, if there's a way to make that happen

@seberg
Member

seberg commented Jul 8, 2019

My patch works for me anyway, in the sense that either const or setting arr.flags.writeable = False is sufficient. (It did segfault with const in a cython cell magic, but I blame a cython/inline issue.)

Cython does indeed force the const in this case (which also means it asks for writeable in all other cases). So, from the point of view of cython typed memoryviews (with new enough cython), my patch seems to achieve exactly the deprecation we want. For other buffer users it could be a bit rougher in principle.
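
The effect on a consumer that does not request a writable buffer can be checked from Python with `memoryview`. This sketch assumes a NumPy version that includes the change; on older versions the second value may differ.

```python
import numpy as np

plain = np.ones(3)
_, bcast = np.broadcast_arrays(np.ones(3), np.array(1.))

# memoryview() does not request a writable buffer from the array.
print(memoryview(plain).readonly)
print(memoryview(bcast).readonly)
```

Here `bcast` is the scalar stretched to length 3, so it is the array whose elements share memory.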

@mhvk
Contributor Author

mhvk commented Jul 8, 2019

Thanks for looking into this! I like the possibility of just using const in cython.

@seberg
Member

seberg commented Jul 8, 2019

I wonder if we should add that const information to the warning (or maybe add it only when the warning is issued from inside the buffer protocol). The only thing my code changes is that if you do not ask for a writeable buffer, we switch to a readonly buffer directly. That is a change in behaviour, but maybe not bad enough to worry about.

I can create a PR soon (although a few other things I would like to tick off the not-done list first). I suppose the next RC would tell us how things go.

EDIT: Of course if someone beats me to it, I am happy as well :)

@seberg seberg self-assigned this Jul 8, 2019
seberg added a commit to seberg/numpy that referenced this issue Jul 13, 2019
When a buffer interface does not request a writeable buffer,
simply pass a read-only one when the warn on write flag is set.

This is to give an easier way forward with avoiding the deprecation
warnings: Simply do not ask for a writeable buffer.

It will break code that expects writeable buffers but does not
ask for them specifically a bit harder than would be nice.
But since such code probably should ask for it specifically, this
is likely fine (an RC release has to find out).

The main reason for this is that this way it plays very well with
cython, which requests writeable buffers explicitly and if declared
`const` is happy about read-only (so that using `const` is the best
way to avoid the warning and makes code cleaner).

Closes numpygh-13929, numpygh-13974
seberg added a commit to seberg/numpy that referenced this issue Jul 14, 2019
@mhvk
Contributor Author

mhvk commented Jul 16, 2019

@seberg - thanks for the fix - it works well, and getting us to declare arrays we don't write to as const is good!

@mhvk mhvk closed this as completed Jul 16, 2019
seberg added a commit to seberg/numpy that referenced this issue Jul 17, 2019