-
-
Notifications
You must be signed in to change notification settings - Fork 11.9k
API: Introduce copy argument for np.asarray [Array API]
#25168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
seberg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you are looking for. If we need to add the copy kwarg to asarray, this is the right way.
The first interesting part is vetting the "no copy" logic (I don't remember if it is bad, but I definitely don't trust it without someone auditing it once more).
The other interesting part is places where copy=False already exists.
For np.array() that is likely fine (a brief search may be nice), some people will be using np.array(..., copy=False) or actually np.array(x, None, False) (or a dtype) 1.
The hope for those use-cases is that most of them will actually like the new meaning, and if they don't they will fail gracefully (it would be cute to mention copy=None and asarray() in the error message).
The trickier thing is probably arr.astype(original, copy=False). The problem is that this is very much very common code and such code will definitely expect the copy to be necessary often.
Also there, you can kust replace it with copy=None, but I am worried about the amount of churn we might create, since copy=False was always a best practice!
(It may be that while a best-practice in many places, it isn't nearly as common in end-user code, but I suspect it is fairly common.)
Footnotes
-
That may look crazy, but the the use of kwargs used to come with a large overhead and
asarraywas a Python wrapper using kwargs making it very slow for just returning the original object. ↩
f2421e2 to
1e023d3
Compare
1e023d3 to
92081fe
Compare
|
It looks like there are a number of places pandas uses |
|
You can use https://pandas-coverage-12d2130077bc.herokuapp.com to find which pandas tests hit a particular line of code in the pandas codebase. |
311af11 to
609f35f
Compare
|
It seems to me that there is a worm somewhere in all of this. What is So, is the meaning of |
|
@seberg To sum up:
For (Also I reflected these changes in masked arrays and legacy matrices, let me know if I should revert any of public API changes) I updated the whole codebase to adhere to these new meanings (if there are no caveats I can address it in downstream libraries).
Hmm I see that Array API defines
|
|
I am trying to avoid to think about it... I think it is strange to diverge the meaning, but breaking |
copy argument for np.asarraycopy argument for np.asarray
96ce46d to
a5785c7
Compare
9a4b7a2 to
92cb661
Compare
|
Hi @seberg, I updated the PR according to our discussion during the triage call and data-apis/array-api#721. The new semantics In the C-level code there is an updated definition of If it's Ok, I would update |
|
I do not see it as my decision to give the OK on changing how copy behaves in But, I do not like the change. The reason why I can live with the redefinition of |
|
In fact, we could just split out the |
d226c87 to
0ede769
Compare
copy argument for np.asarraycopy argument for np.asarray [Array API]
copy argument for np.asarray [Array API]copy argument for np.asarray [Array API]
0ede769 to
47da9e4
Compare
In the triage meeting we decided to add the kwarg in a try, catch a kwarg error, and emit a deprecation error if the user's |
@seberg The only relevant place that I found where Classes in tests/codebase that implement |
ee8993b to
3eba805
Compare
|
Ok, I updated docs so that |
3eba805 to
786d2d8
Compare
ngoldbaum
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not following up quickly after you responded to my last review. I think there's still one refcounting issue. I also noticed a separate more minor issue in the docs but I think is worth addressing to reduce confusion from users.
@seberg if you have a chance, it would be nice if you could confirm you're ok with the semantics in this version of the PR just so I can make sure we're all on the same page.
|
I am happy with the addition. Note that it will be quite noisy unless you do not pass copy=None and take it this very slow. Code wise happy if you are. matching error exact is likely nice. Passing positionally for simplicity woukd be fine to me or using vectorcall would be best (much quicker probably and probably also simpler). |
786d2d8 to
222ea31
Compare
222ea31 to
beb523c
Compare
|
Merging this. Thanks @mtsokol! Also thanks for getting all the NEP 56 PRs across the line! 🎆 🥳 If you are reading this because this PR broke your nightly tests, take a look at the new docs, the NEP, and the scipy and scikit-learn PRs. If there is some sort of breakage in your code that is not straightforward to deal with in a backward compatible way, please open an issue. |
|
Thanks for the hard work on this! Still a few things to iron out I think, xref gh-25916. |
Tracking issue: #25076
Hi @seberg,
Here's a PR with an initial work on introducing
copykeyword tonp.asarray.I started looking at the internals that we discussed during the triage meeting, but first, I wanted to try with the simplest approach: The implementation of
np.asarrayis mostly the same asnp.array- it just skips a few parameters. One of them iscopywhich is hardcoded toFalseinnp.asarray.Here I added
copywith a default valueFalse, so it's a backward compatible change. Forcopy=Trueit behaves the same way asnp.array(..., copy=True). To sum up - thecopymechanism didn't change here, only thecopykeyword is added for compatibility.WDYT?