Thanks to visit codestin.com
Credit goes to github.com

Skip to content

API: Remove ptp, itemset and newbyteorder from np.ndarray class #24682

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 20, 2023

Conversation

mtsokol
Copy link
Member

@mtsokol mtsokol commented Sep 11, 2023

Hi @rgommers @ngoldbaum,

This PR contains the first batch of cleanup changes for numpy.ndarray class described in NEP 52 section. It covers removal of ptp, setitem and newbyteorder methods from ndarray. Should I add deprecation message or are these too niche to mention that?

Please note that I left np.matrix and np.ma.MaskedArray classes intact as they are already legacy. (masked array has its own implementation and matrix class uses np.ptp)

@mtsokol mtsokol self-assigned this Sep 11, 2023
@mtsokol mtsokol force-pushed the ndarray-class-cleanup-part-1 branch from 06a34b2 to 80cbd85 Compare September 11, 2023 16:46
@mtsokol mtsokol force-pushed the ndarray-class-cleanup-part-1 branch from 80cbd85 to 65bb8dd Compare September 11, 2023 16:51
@ngoldbaum
Copy link
Member

Should I add deprecation message or are these too niche to mention that?

Searching on github, there seems to be a lot of usage of newbyteorder to get a byteswapped version of an array. Like here in pandas.

ptp also seems to have a decent amount of downstream usage, e.g. here in matplotlib.

usage of setitem is harder to evaluate because the name isn't really unique to numpy in python code, but usages in popular downstream libraries don't immediately pop out in my search.

From that, I think we definitely at least need an AttributeError that tells people what the migration is when people access the removed attributes. I'm also a tiny bit nervous that the migration we need to suggest for newbyteorder might get complicated, judging by the documentation update. The most common usage seems to be doing array.byteswap().newbyteorder() to get a byteswapped copy. Is there a copy-pasteable idiom we can suggest instead that doesn't depend on details on the current dtype?

@mtsokol
Copy link
Member Author

mtsokol commented Sep 11, 2023

I'm also a tiny bit nervous that the migration we need to suggest for newbyteorder might get complicated, judging by the documentation update. The most common usage seems to be doing array.byteswap().newbyteorder() to get a byteswapped copy. Is there a copy-pasteable idiom we can suggest instead that doesn't depend on details on the current dtype?

As a replacement I'm using:

arr.view(arr.dtype.newbytorder(new_order))

as np.ndarray.newbyteorder points out that they are equivalent.

@seberg seberg added the triage review Issue/PR to be discussed at the next triage meeting label Sep 11, 2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI the ndarray.itemset type annotations still persist as of the moment and will have to be removed as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out! I removed them.

@mtsokol mtsokol force-pushed the ndarray-class-cleanup-part-1 branch from ca8425f to 1b24963 Compare September 12, 2023 11:08
@ngoldbaum
Copy link
Member

Let me know if you need a hand setting up the AttributeError in C (or just a deprecation if we decide an outright error is too disruptive).

@mtsokol mtsokol force-pushed the ndarray-class-cleanup-part-1 branch from c217775 to 0d4887d Compare September 13, 2023 10:16
@mtsokol
Copy link
Member Author

mtsokol commented Sep 13, 2023

Let me know if you need a hand setting up the AttributeError in C (or just a deprecation if we decide an outright error is too disruptive).

@ngoldbaum Right, I forgot to add them. I pushed new changes - now in ndarray and scalar types, when accessing each of these functions, a custom AttributeError is raised.

ksunden added a commit to ksunden/matplotlib that referenced this pull request Sep 13, 2023
xref numpy/numpy#24682

Only a handful of lines, and _mostly_ in tests/relatively unused utility functions like rgb_to_hsv, but still something that should be addressed.
Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like matplotlib is ready. Are there any other downstream libraries we need to wait on PRs for?

Just one minor nit on the error message otherwise I think this is ready so long as downstream is ready as far as we're aware.

array_ptp(PyArrayObject *self, void *NPY_UNUSED(ignored))
{
PyErr_SetString(PyExc_AttributeError,
"`ptp` has been removed from ndarray class. "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, here and in the other messages I'd say "was removed from the ndarray class in NumPy 2.0".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I updated them.

@mtsokol
Copy link
Member Author

mtsokol commented Sep 17, 2023

It looks like matplotlib is ready. Are there any other downstream libraries we need to wait on PRs for?

I checked other libraries and opened four PRs that reflect changes introduced there:

Then we're good to go.

@eendebakpt
Copy link
Contributor

Both ptp and itemset I have used quite a lot. The removal of ptp is ok, it is a bit slower, but not too much:

import numpy as np
import numpy.core.multiarray as mu
x=np.random.rand(20,)
%timeit np.ptp(x)
%timeit x.ptp()
%timeit mu.maximum.reduce(x) - mu.minimum.reduce(x) # more verbose, but a fast alternative

results in:

5.2 µs ± 16.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
4.36 µs ± 181 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.69 µs ± 14.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The itemset is a bit strange:

import numpy as np
x = np.random.rand(10,)
r = np.random.rand(3, 3)

%timeit x.itemset(2, 3)
%timeit x[2] = 3
%timeit r.itemset(2, 3)
%timeit r[2] = 3
%timeit r.itemset(0, 1, 3)
%timeit r[0, 1] = 3

results in

120 ns ± 3.04 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
98.9 ns ± 0.14 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
168 ns ± 18.1 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
440 ns ± 15.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
143 ns ± 4.72 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
121 ns ± 5.59 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

So the normal indexing is even faster than a direct itemset, even though the documentation for itemset claims "Compared to indexing syntax, itemset provides some speed increase ..." (in older versions of numpy/python this was true). The only exception is for indexing a 2D array with a single index. But that case might be rare.

@mtsokol The title of the PR contains setitem, maybe change that into itemset?

@mtsokol mtsokol changed the title API: Remove ptp, setitem and newbyteorder from np.ndarray class API: Remove ptp, itemset and newbyteorder from np.ndarray class Sep 17, 2023
@mtsokol
Copy link
Member Author

mtsokol commented Sep 17, 2023

@eendebakpt Ah yes, I did a typo - fixed.

@ngoldbaum
Copy link
Member

All the downstream PRs are merged so let's pull this one in. Thanks @mtsokol. As always, if you're reading this because of CI breakage and you think we made a mistake in this PR and the migrations we've offered aren't sufficient, please open an issue.

@ngoldbaum ngoldbaum merged commit 07db5be into numpy:main Sep 20, 2023
@mtsokol mtsokol deleted the ndarray-class-cleanup-part-1 branch September 20, 2023 15:09
@jakirkham
Copy link
Contributor

Were there deprecation warnings added to NumPy 1 for these removals somewhere?

@ngoldbaum
Copy link
Member

Were there deprecation warnings added to NumPy 1 for these removals somewhere?

Given that it's a major version bump we decided to do a few API cleanups without any deprecation, including these. If that proves to be too disruptive, we're open to re-adding with deprecation warnings.

@jakirkham
Copy link
Contributor

We just ran into it when testing between NumPy 1 & 2

I don't think we need to readd them as there is a path to using the functions instead of the methods

Though it might be worth adding a deprecation warning in the next NumPy 1 release just so users know this is coming

@jakirkham
Copy link
Contributor

cc @grlee77 (who ran into this; in case you have more thoughts here)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
30 - API Numpy 2.0 API Changes triage review Issue/PR to be discussed at the next triage meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants