BUG: ensure find-like ufuncs convert arguments to common dtypes #26198
ae4e820 to 793fff5
This looks good in principle, but I wonder whether a better approach would be to use the promoter: it sort of feels like it is made for dealing with this kind of dtype mismatch. That said, doing so might remove failures for mismatching dtypes that you would like to preserve. So perhaps this is the simplest way, at least for now.
793fff5 to c5520b3
The problem with using the promoter is I'm not handed dtype instances, only the dtypemetas. That means in Maybe this is a use case for
If you just add a promoter you should be getting a new
Sorry if I was unclear, I was talking about distinct Right now on the
This PR avoids the error by casting the needle string to It's not clear to me how to fix this using a promoter only, because the error path that's generating the
I suppose you don't have a "cannot be a NULL object" (since that can be used with another one in a binary operation just fine)? But in that case yes, you need to either:
Although, I think it would make sense to see if the str->string cast cannot return a StringDType instance which works here.
Yes, I wonder if one cannot have a (possibly internal only)
OK, fair enough. I'm going to come back to this with a different approach.
This cleans up how `np.strings` functions handle multiple string arguments.

Currently, some functions call `np.asanyarray` and some don't. This makes it so all functions in this namespace that take multiple string arguments do that.

This came up working with pandas, where I want to use `np.dtypes.StringDType(coerce=False, na_object=pd.NA)` as the "default" numpy string dtype. If I don't make these changes, then I have to manually coerce all the arguments to `np.strings` functions to the same common `StringDType` instance. This is particularly annoying when dealing with python strings as arguments, since those get coerced to the default `StringDType` by the promoters, which then leads to an error when trying to use a ufunc with two non-equal `StringDType` instances.

IMO it will be easier for everyone if numpy just deals with this issue in the `np.strings` wrappers. You can also bypass this coercion by explicitly passing ndarray arguments, due to the use of e.g. `dtype=getattr(arg, "dtype", a.dtype)` in all the wrappers.

There are also a couple of docstring cleanups I noticed.
May require #26147 to be merged for the tests to all pass.
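
For illustration, here is a minimal sketch of the coercion pattern described above. The helper name `coerce_like` is hypothetical and is not part of numpy or of this PR, and the example uses fixed-width unicode dtypes rather than `StringDType` so that it runs on older numpy versions too. It only shows how `getattr(arg, "dtype", a.dtype)` lets a Python string inherit the first argument's dtype while an ndarray keeps its own.

```python
import numpy as np


def coerce_like(a, arg):
    """Hypothetical helper: convert `arg` to an array, borrowing the dtype
    of `a` unless `arg` already carries a dtype of its own."""
    a = np.asanyarray(a)
    return np.asanyarray(arg, dtype=getattr(arg, "dtype", a.dtype))


haystack = np.array(["hello", "world"], dtype="U10")

# A Python str has no `.dtype` attribute, so it takes the haystack's dtype.
needle = coerce_like(haystack, "o")

# An ndarray already has a dtype, so it is passed through unchanged,
# which is how explicit ndarray arguments bypass the coercion.
explicit = coerce_like(haystack, np.array("o", dtype="U1"))
```

In this sketch `needle` ends up with the same dtype as `haystack`, while `explicit` keeps the dtype it was created with.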