-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
ENH: improved dtype inference for Index.map #44609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
looks good. is this a user visible thing? if so can you add a whatsnew |
An example to illustrate, assume we have an uint32 index:
On master we always get an int64 index back, while with this PR it tries to preserve the original dtype:
Now, I don't know if this is actually a user visible change compared to the released version, since the above example is with a non-int64 dtype, which is only in master (so in that sense you could see it as a bug fix in the new not-yet-released feature) |
Ah, we of course already had UInt64Index, for example, and that will now also preserve the dtype instead of returning Int64Index in the example above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current tests you edited are (all?) simple lambda x: x
checks that ensure the dtype is now preseved. Can you ensure we also test the case where a maybe_cast_pointwise_result
needs to fallback? Eg idx.map(lambda x: x*1000)
with an uint8 index, where the resulting values thus don't fit in the original dtype.
yes, will add whatsnew. |
I do intend to do this (https://github.com/pandas-dev/pandas/pull/44609/files#diff-9090cca4ac914e526dfb58dd84ce7c75e2935e41cdeb3ee1618bb5068f05e2dbR549), the question is when. This PR in its current state is just splitting the Index.map changes off from #43930. I'm happy to take the time to do this The Right Way, but that does mean marginally slowing down the ExtensionIndex PR. |
Broken off from #43930