MAINT: Refactor PyArray_IntpFromIndexSequence into PyArray_IntpFromIndexArray #13407

eric-wieser
Member

There was some clumsy juggling of lengths in PyArray_IntpConverter where we ask for the length of the argument twice, and try to handle wrapping the object twice too.

This fixes some bugs that probably no one ever will or did see:

  • A corrupt pickle file with a shape array that is too long would be silently truncated
  • An object that returns a different __len__ each time it is asked will no longer fail
  • IntpFromSequence would overflow its output buffer if nd == 0

This collapses PyArray_IntpFromIndexSequence back into PyArray_IntpFromSequence, since the former no longer has any callers (and is internal).
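
As a rough illustration of the output-buffer bug in the list above, the sketch below shows the kind of capacity check that prevents writing into a zero-length buffer. It is a simplified model, not the NumPy code: `intp_t` and `store_scalar_checked` are hypothetical names, and it assumes the overflow comes from storing one value without first checking the capacity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for npy_intp; the real type comes from the NumPy headers. */
typedef ptrdiff_t intp_t;

/*
 * Store a single value into an output buffer with room for `maxvals` items.
 * Returns the number of values written (1), or -1 if the buffer has no room.
 * Assumption: this models only the capacity check, not Python object handling.
 */
static int
store_scalar_checked(intp_t value, intp_t *vals, int maxvals)
{
    if (maxvals < 1) {
        /* Refuse to write past the end of a zero-length buffer. */
        return -1;
    }
    assert(vals != NULL);
    vals[0] = value;
    return 1;
}
```

With a capacity of zero the function reports an error and leaves the buffer untouched, instead of writing `vals[0]` unconditionally.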

@eric-wieser eric-wieser requested a review from seberg April 26, 2019 08:04
@eric-wieser eric-wieser force-pushed the PyArray_IntpFromIndexSequence-PySequenceFast branch from 9bc9cd3 to 9612c18 Compare April 26, 2019 08:05
/*
* After the deprecation the PyNumber_Check could be replaced
* by PyIndex_Check.
* FIXME 1.9 ?
Member Author

@eric-wieser eric-wieser Apr 26, 2019

This is already handled within PyArray_IntpFromIndexSequence on line 131 - all this did was forbid objects that implement __index__ but neither __int__ nor __float__, and try to produce better error messages. The former was a waste of time, and the latter is now handled by catching TypeError and replacing the message on line 147

The deprecation this refers to was eliminated a long time ago.

Member

@seberg seberg left a comment

Can't find anything to nitpick (should read the non-diff code once more at some point).

The only thing I find strange is that chunk of code in methods.c, but maybe I am missing things. Should we just deprecate that PyArray_MIN while we are at it (I do not care much either way)? The multiarray.c check would be unnecessary in the future then.

@@ -974,7 +986,30 @@ PyArray_IntpFromIndexSequence(PyObject *seq, npy_intp *vals, npy_intp maxvals)
NPY_NO_EXPORT int
PyArray_IntpFromSequence(PyObject *seq, npy_intp *vals, int maxvals)
Member

I wonder if we should deprecate that maxvals behaviour. Not the parameter itself, but the use of the minimum. If maxvals is smaller than the sequence length, that should really be an error rather than silently ignoring the extra items.

Member Author

If we do that, we'd just deprecate this whole function in favor of the other one.
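
To make the two behaviours in this thread concrete, here is a minimal sketch contrasting "take the minimum" with "error if too long". It works on plain C arrays rather than Python sequences, and `intp_t`, `copy_truncating`, `copy_strict` and their return conventions are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for npy_intp. */
typedef ptrdiff_t intp_t;

/*
 * Min-length semantics: copy at most `maxvals` items and silently drop the rest.
 * Returns the number of items actually copied.
 */
static int
copy_truncating(const intp_t *src, int n, intp_t *vals, int maxvals)
{
    int count = n < maxvals ? n : maxvals;
    for (int i = 0; i < count; i++) {
        vals[i] = src[i];
    }
    return count;
}

/*
 * Strict semantics: an input longer than the buffer is an error.
 * Returns the number of items copied, or -1 without writing anything.
 */
static int
copy_strict(const intp_t *src, int n, intp_t *vals, int maxvals)
{
    if (n > maxvals) {
        return -1;
    }
    assert(n == 0 || (src != NULL && vals != NULL));
    for (int i = 0; i < n; i++) {
        vals[i] = src[i];
    }
    return n;
}
```

Given a three-element input and a two-element buffer, the truncating version copies two items and reports success, while the strict version reports an error and leaves the buffer alone. The second outcome is the one argued for above.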

@eric-wieser
Member Author

> The only thing I find strange is that chunk of code in methods.c, but maybe I am missing things

There are two things going on there:

  • We don't want to use PyArray_IntpFromSequence there because it has the bizarre min-length semantics
  • We don't want to use PyArray_IntpConverter because that:
    • Does a heap allocation (we do a single contiguous allocation further down)
    • Allows single integers as a shape, which should never have made it into a pickle file.

We could simplify things a bunch if we allowed np.zeros(2) to allocate an intermediate tuple and transform to np.zeros((2,)), rather than optimizing this case

@seberg
Member

seberg commented May 3, 2019

Yeah, the min length semantics are weird (though I think you can hack around them), but then this code path is somewhat more restrictive as well, which may be good, so I do not mind.

Frankly, if we really wanted to micro-optimize, we should probably do a PyInt_Exact check first thing...

eric-wieser and others added 2 commits June 3, 2019 22:24
…dexArray

There was some clumsy juggling of lengths in PyArray_IntpConverter where we ask for the length of the argument twice, and try to handle wrapping the object twice too.

This fixes some bugs that probably no one ever will or did see:
* A corrupt pickle file with a `shape` array that is too long would be silently truncated
* An object that returns a different `__len__` each time it is asked will no longer fail
* IntpFromSequence would overflow its output buffer if nd == 0

This collapses `PyArray_IntpFromIndexSequence` back into `PyArray_IntpFromSequence`, since the former no longer has any callers (and is internal).

Also update a test to fail with a more useful message.
@eric-wieser eric-wieser force-pushed the PyArray_IntpFromIndexSequence-PySequenceFast branch from c76e027 to a43c29f Compare June 4, 2019 05:26
@mattip
Member

mattip commented Dec 5, 2019

@eric-wieser this still has the WIP label

@eric-wieser
Member Author

I think I had some more cleanup to these functions locally - I'll see if I can revive it.

Base automatically changed from master to main March 4, 2021 02:04
@charris charris added the 52 - Inactive Pending author response label Apr 6, 2022
@charris charris closed this Apr 6, 2022
@seberg
Member

seberg commented Apr 6, 2022

My guess is that this one can be considered superseded by gh-20175 anyway. I did not remember this start.
