MAINT: Refactor PyArray_IntpFromIndexSequence into PyArray_IntpFromIndexArray #13407

eric-wieser · 2019-04-26T08:04:45Z

There was some clumsy juggling of lengths in PyArray_IntpConverter where we ask for the length of the argument twice, and try to handle wrapping the object twice too.

This fixes some bugs that probably no one ever will or did see:

* A corrupt pickle file with a shape array that is too long would be silently truncated
* An object that returns different __len__s each time it is asked will no longer fail
* IntpFromSequence would overflow its output buffer if nd == 0

This collapses PyArray_IntpFromIndexSequence back into PyArray_IntpFromSequence, since the former no longer has any callers (and is internal).
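
To make the truncation issue concrete, here is a minimal pure-Python model of the two behaviours. It is only a sketch of the semantics described above, not NumPy's C code: the function names `intp_from_sequence_min` and `intp_from_sequence_strict` and the error message are hypothetical, and the out-of-bounds write for nd == 0 cannot be reproduced at this level.

```python
import operator


def intp_from_sequence_min(seq, maxvals):
    """Model of the min-length behaviour: keep at most maxvals items."""
    items = tuple(seq)
    # Anything past maxvals is dropped without any error or warning.
    return [operator.index(v) for v in items[:maxvals]]


def intp_from_sequence_strict(seq, maxvals):
    """Model of the stricter behaviour: too many items is an error."""
    # Materialise the sequence once, so the length we check and the items
    # we convert come from the same snapshot of the object.
    items = tuple(seq)
    if len(items) > maxvals:
        raise ValueError(
            f"sequence has {len(items)} items, at most {maxvals} allowed")
    return [operator.index(v) for v in items]


# A three-entry shape read into a two-entry buffer:
truncated = intp_from_sequence_min((2, 3, 4), 2)   # the trailing 4 is lost
```

With the strict variant the same call raises `ValueError` instead of returning a shortened result.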

eric-wieser · 2019-04-26T08:08:12Z

numpy/core/src/multiarray/conversion_utils.c

-            /*
-             * After the deprecation the PyNumber_Check could be replaced
-             * by PyIndex_Check.
-             * FIXME 1.9 ?


This is already handled within PyArray_IntpFromIndexSequence on line 131 - all this did was forbid objects that implement __index__ but neither __int__ nor __float__, and try to produce better error messages. The former was a waste of time, and the latter is now handled by catching TypeError and replacing the message on line 147

The deprecation this refers to was eliminated a long time ago.
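
As a rough illustration of the point above, the following pure-Python sketch converts a single shape entry through `__index__` and replaces the generic `TypeError` message. It assumes the C code ends up doing the equivalent of `operator.index`; the class `IndexOnly`, the helper `to_intp`, and the message text are hypothetical.

```python
import operator


class IndexOnly:
    """Hypothetical type defining __index__ but not __int__ or __float__."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


def to_intp(obj):
    """Convert one shape entry, swapping in a more specific error message."""
    try:
        return operator.index(obj)
    except TypeError:
        # Hypothetical wording; the real message lives in the C source.
        raise TypeError(
            f"expected an integer, got {type(obj).__name__}") from None


value = to_intp(IndexOnly(3))  # accepted: __index__ alone is enough
```

A float such as `2.5` has no `__index__`, so `to_intp(2.5)` raises `TypeError` with the replaced message.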

seberg

Can't find anything to nitpick (should read the non-diff code once more at some point).

The only thing I find strange is that chunk of code in methods.c, but maybe I am missing things. Should we just deprecate that PyArray_MIN while we are at it (I do not care much either way)? The multiarray.c check would be unnecessary in the future then.

numpy/core/src/multiarray/methods.c

seberg · 2019-04-30T18:04:33Z

numpy/core/src/multiarray/conversion_utils.c

@@ -974,7 +986,30 @@ PyArray_IntpFromIndexSequence(PyObject *seq, npy_intp *vals, npy_intp maxvals)
 NPY_NO_EXPORT int
 PyArray_IntpFromSequence(PyObject *seq, npy_intp *vals, int maxvals)


I wonder if we should deprecate that maxvals thing. Well, not providing it, but using the minimum. If maxvals is smaller, it should really be an error and not just ignore things.

If we do that, we'd just deprecate this whole function in favor of the other one.
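
One way such a deprecation could look, modelled in pure Python: keep truncating for now, but warn whenever the input is longer than maxvals. This is a sketch of the idea only; the function name and warning text are hypothetical and nothing here reflects what NumPy actually does.

```python
import operator
import warnings


def intp_from_sequence_deprecating(seq, maxvals):
    """Truncate to maxvals as before, but warn that this will become an error."""
    items = tuple(seq)
    if len(items) > maxvals:
        warnings.warn(
            "passing more than maxvals items is deprecated and will "
            "become an error",
            DeprecationWarning,
            stacklevel=2,
        )
        items = items[:maxvals]
    return [operator.index(v) for v in items]
```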

eric-wieser · 2019-05-03T05:00:42Z

> The only thing I find strange is that chunk of code in methods.c, but maybe I am missing things

There are two things going on there:

- We don't want to use PyArray_IntpFromSequence there because it has the bizarre min-length semantics
- We don't want to use PyArray_IntpConverter because that:
  - Does a heap allocation (we do a single contiguous allocation further down)
  - Allows single integers as a shape, which should never have made it into a pickle file.

We could simplify things a bunch if we allowed np.zeros(2) to allocate an intermediate tuple and transform to np.zeros((2,)), rather than optimizing this case.
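
The difference between the two paths can be modelled in pure Python as follows. This is a sketch under the assumption that the user-facing converter accepts a bare integer while the unpickling path should only accept a tuple; both function names and the error text are hypothetical.

```python
import operator


def shape_from_user_arg(arg):
    """Lenient path: a bare integer n is treated as the 1-tuple (n,)."""
    try:
        n = operator.index(arg)
    except TypeError:
        # Not an integer, so treat it as a sequence of integers.
        return tuple(operator.index(v) for v in arg)
    return (n,)


def shape_from_pickle_state(arg):
    """Strict path: only a tuple of integers is accepted."""
    if not isinstance(arg, tuple):
        raise TypeError("shape in pickled state must be a tuple")
    return tuple(operator.index(v) for v in arg)
```

Here `shape_from_user_arg(2)` and `shape_from_user_arg((2,))` give the same result, while `shape_from_pickle_state(2)` is rejected.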

seberg · 2019-05-03T05:12:11Z

Yeah, the min length semantics are weird (though I think you can hack around them), but then this code path is somewhat more restrictive as well, which may be good, so I do not mind.

Frankly, if we really wanted to micro-optimize, we should probably do a PyInt_Exact check first thing...
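
In Python terms, the suggested fast path might look like the sketch below: test for an exact `int` before falling back to the general `__index__` protocol. The helper name is hypothetical, and whether this pays off in the C code is not measured here.

```python
import operator


def index_with_fast_path(obj):
    """Return obj as an integer, skipping the protocol call for exact ints."""
    # type(obj) is int is False for subclasses such as bool, so those
    # still go through the general path below.
    if type(obj) is int:
        return obj
    return operator.index(obj)
```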

…dexArray

There was some clumsy juggling of lengths in PyArray_IntpConverter where we ask for the length of the argument twice, and try to handle wrapping the object twice too.

This fixes some bugs that probably no one ever will or did see:

* A corrupt pickle file with a `shape` array that is too long would be silently truncated
* An object that returns different `__len__`s each time it is asked will no longer fail
* IntpFromSequence would overflow its output buffer if nd == 0

This collapses `PyArray_IntpFromIndexSequence` back into `PyArray_IntpFromSequence`, since the former no longer has any callers (and is internal).

Also update a test to fail with a more useful message.

Co-Authored-By: eric-wieser <[email protected]>

mattip · 2019-12-05T11:29:23Z

@eric-wieser this still has the WIP label

eric-wieser · 2019-12-05T11:36:19Z

I think I had some more cleanup to these functions locally - I'll see if I can revive it.

seberg · 2022-04-06T17:52:34Z

My guess is that this one can be considered superseded by gh-20175 anyway. I did not remember this start.

eric-wieser requested a review from seberg April 26, 2019 08:04

eric-wieser force-pushed the PyArray_IntpFromIndexSequence-PySequenceFast branch from 9bc9cd3 to 9612c18 Compare April 26, 2019 08:05

eric-wieser mentioned this pull request Apr 26, 2019

MAINT: IntpFromSequence cannot return a wrong length here. #13405

Closed

eric-wieser force-pushed the PyArray_IntpFromIndexSequence-PySequenceFast branch from 9612c18 to 0b7b8d3 Compare April 26, 2019 08:15

tylerjereddy added the 03 - Maintenance label Apr 26, 2019

seberg reviewed Apr 30, 2019

charris added the component: numpy._core label May 1, 2019

eric-wieser and others added 2 commits June 3, 2019 22:24

Apply suggestions from code review

a43c29f

Co-Authored-By: eric-wieser <[email protected]>

eric-wieser force-pushed the PyArray_IntpFromIndexSequence-PySequenceFast branch from c76e027 to a43c29f Compare June 4, 2019 05:26

Base automatically changed from master to main March 4, 2021 02:04

charris added the 52 - Inactive Pending author response label Apr 6, 2022

charris closed this Apr 6, 2022
