NEP: NEP 55 revision - dedicated scalar type #28842
Will |
Do you mean support for missing data? That's a good question. The first thing that comes to my mind is "it doesn't need to":

    In [1]: arr = np.array(["hello", "world", np.nan], dtype=np.dtypes.StringDType(na_object=np.nan))
    In [2]: arr[1]
    Out[2]: np.vstr('world')
    In [3]: arr[2]
    Out[3]: nan

And an empty scalar can be represented by an empty string:

    In [1]: np.vstr()
    Out[1]: np.vstr('')
Hmm, for static typing that would be kinda problematic. For example if you have some The |
I think this might need a little bit more careful thought. It might even need its own (smaller than NEP-55) NEP. In particular, this update doesn't engage with @seberg's main criticism of your implementation PR: you're not proposing a IMO this would be a lot stronger if you could make it so If you don't think
The problem for Maybe the |
Haha that's exactly what I've done in numpy/numtype#335, and it's on my "port-to-numpy" TODO list. update: #28856 |
Issue: #28165
Hi!
As requested in the comment, this PR updates NEP 55 to describe the `StringDType` scalar feature.

Let's first decide on the right approach for implementing it. The initial idea is to have a scalar type that owns a UTF-8 encoded string without interacting with the `StringDType` allocation mechanisms. The type participates in NumPy's type hierarchy and fills the gaps in typing capabilities left by the "Python `str` as scalar" approach.
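
To make the ownership idea concrete, here is a minimal pure-Python sketch. It is not the implementation proposed in this PR: `GenericSketch` and `VStrSketch` are hypothetical names standing in for a NumPy scalar base class and the proposed scalar, and the real type would presumably be written in C. The sketch only illustrates two properties from the description: the scalar keeps its own UTF-8 buffer, and it belongs to a dedicated hierarchy rather than being a plain `str`.

```python
class GenericSketch:
    """Hypothetical stand-in for a scalar base class (not a NumPy type)."""


class VStrSketch(GenericSketch):
    """Hypothetical scalar that owns its text as UTF-8 bytes."""

    def __init__(self, value: str = "") -> None:
        # Encode once and keep a private immutable copy; no array
        # allocator is involved in this sketch.
        self._utf8 = value.encode("utf-8")

    def __str__(self) -> str:
        return self._utf8.decode("utf-8")

    def __repr__(self) -> str:
        return f"VStrSketch({str(self)!r})"

    def __len__(self) -> int:
        # Length in code points, like str, not in UTF-8 bytes.
        return len(str(self))

    def __eq__(self, other):
        if isinstance(other, VStrSketch):
            return self._utf8 == other._utf8
        if isinstance(other, str):
            return str(self) == other
        return NotImplemented

    def __hash__(self) -> int:
        # Hash like the equivalent str so equal values hash equally.
        return hash(str(self))


s = VStrSketch("h\u00e9llo")
print(repr(s), len(s), isinstance(s, GenericSketch), isinstance(s, str))
```

Because the sketch type is not a `str` subclass, a type checker can distinguish it from arbitrary Python strings, which is the typing gap the description refers to.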