np.repeat not accepting np.uint for repeats #15965

jeras · 2020-04-13T08:54:27Z

repeat only accepts an int or an int array for repeats; np.uint is not accepted. The function still checks whether the repeats are negative, though. Since I am working with nonnegative values, this becomes an unnecessary conversion step.

Reproducing code example:

import numpy as np
x = np.array([1,2,3,4])
r = np.array([1,0,0,2], dtype=np.uint)
x.repeat(r)

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-9ac14301bd8a> in <module>
      2 x = np.array([1,2,3,4])
      3 r = np.array([1,0,0,2], dtype=np.uint)
----> 4 x.repeat(r)

TypeError: Cannot cast array data from dtype('uint64') to dtype('int64') according to the rule 'safe'

Numpy/Python version information:

1.17.4 3.8.2 (default, Mar 13 2020, 10:14:16) 
[GCC 9.3.0]

eric-wieser · 2020-04-13T09:28:34Z

I think this is by design, things like indices / strides / repeat numbers are expected to be of type intp, and anything else is cast. In this case, the casting is "safe", meaning it refuses to cast uint64 to int64 as it would corrupt values >= 2**63.
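To illustrate the point about the "safe" rule, here is a minimal sketch using np.can_cast and an explicit astype. The variable names are only for illustration, and the output comment assumes a two's-complement platform.

```python
import numpy as np

# "safe" refuses uint64 -> int64 because values >= 2**63 cannot be represented.
safe_ok = np.can_cast(np.uint64, np.int64, casting="safe")
# Smaller unsigned types fit entirely inside int64, so they are "safe".
small_ok = np.can_cast(np.uint32, np.int64, casting="safe")
# "unsafe" allows any cast, including ones that change values.
unsafe_ok = np.can_cast(np.uint64, np.int64, casting="unsafe")
print(safe_ok, small_ok, unsafe_ok)  # False True True

# What the "safe" rule protects against: a large value wraps to a negative one.
big = np.array([2**63], dtype=np.uint64)
wrapped = big.astype(np.int64)
print(wrapped)  # [-9223372036854775808]
```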

Performing the cast yourself with a less strict casting rule (such as .astype(np.intp)) would solve your issue, while performing the same number of casts.
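As a sketch of that workaround applied to the reproducing example above (assuming the repeat counts are known to fit in np.intp):

```python
import numpy as np

x = np.array([1, 2, 3, 4])
r = np.array([1, 0, 0, 2], dtype=np.uint)

# Cast the unsigned repeats explicitly; astype defaults to casting="unsafe",
# so the uint -> intp conversion is allowed.
result = x.repeat(r.astype(np.intp))
print(result)  # [1 4 4]
```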

jeras · 2020-04-13T09:56:24Z

I can't comment on Python/Numpy design preferences. I just thought uint could be used for repeats: this would bypass the check for negative values, and there would be no need for casting, since uint64 is already the same (on many architectures) as the uintptr_t from stdint.h.

I already used .astype(np.int) for casting, and I will consider using np.intp instead.

Feel free to close this issue.

eric-wieser · 2020-04-13T10:24:39Z

You should never use np.int, as it's just a really bad spelling of int (#6103). If you mean the C long type, use np.int_ (which is the complement to np.uint, the C unsigned long type).
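For reference, a small sketch that inspects the types mentioned here. The sizes it reports depend on the platform and NumPy version, so no output is shown; the `info` dictionary is just an illustrative name.

```python
import numpy as np

# Collect kind ('i' = signed, 'u' = unsigned) and size in bytes for each type.
info = {}
for name in ("int_", "uint", "intp", "uintp"):
    dt = np.dtype(getattr(np, name))
    info[name] = (dt.name, dt.kind, dt.itemsize)
    print(name, dt.name, dt.kind, dt.itemsize)
```

Note that the np.int alias was later deprecated (NumPy 1.20) and removed (NumPy 1.24), so code using .astype(np.int) fails on current releases.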

Internally, numpy has chosen to use intptr_t (np.intp) almost everywhere, staying away from uintptr_t (np.uintp). This is partially motivated by the need for negative strides, but probably mostly just for simplicity.

I don't want to close this issue just yet, I agree your use-case is fairly reasonable.

> this would have to bypass the check for negative values

Right, but you wouldn't actually get an efficiency gain here because instead we'd need to perform a check that all values are less than 2**63 before casting internally to intp.
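A user-side sketch of that kind of range check is below. The helper name to_intp_checked is hypothetical and not part of NumPy, and this is not how NumPy implements anything internally; it only shows that accepting unsigned input trades the negativity check for an upper-bound check.

```python
import numpy as np

def to_intp_checked(repeats):
    """Hypothetical helper: cast unsigned repeats to intp, rejecting overflow."""
    repeats = np.asarray(repeats)
    if repeats.dtype.kind != "u":
        raise TypeError("expected an unsigned integer array")
    limit = np.iinfo(np.intp).max
    # Compare as Python ints to avoid any mixed signed/unsigned promotion.
    if repeats.size and int(repeats.max()) > limit:
        raise OverflowError("repeat count does not fit in intp")
    return repeats.astype(np.intp)

x = np.array([1, 2, 3, 4])
r = np.array([1, 0, 0, 2], dtype=np.uint64)
out = x.repeat(to_intp_checked(r))
print(out)  # [1 4 4]
```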

seberg · 2020-04-13T14:27:06Z

In most of these places we do use same kind casting. That is arguably incorrect of course, but unfortunately not being able to index e.g. with a uint64 would be strange as well. The only fix would be to have a whole code path(s) around bounds-checking for different integer types.
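A short sketch of the contrast described here: indexing accepts a uint64 index array, and np.can_cast reports that uint64 to intp is permitted under the "same_kind" rule. Variable names are illustrative only.

```python
import numpy as np

x = np.array([10, 20, 30, 40])
idx = np.array([0, 3], dtype=np.uint64)

# Indexing with an unsigned index array is accepted.
picked = x[idx]
print(picked)  # [10 40]

# The looser rule allows the unsigned -> signed cast that "safe" rejects.
same_kind_ok = np.can_cast(np.uint64, np.intp, casting="same_kind")
print(same_kind_ok)  # True
```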
