array(...) automatically casts mixed lists of int and string to string (except with dtype object) #6550
Comments
I think the same bug is behind the weird behavior of assert_array_equal here below
I agree with you that object dtype would be preferable, but this is also pretty long-standing behavior in numpy.
Is the bug only with strings and floats, or with any mixed-type data?
On Thursday, October 22, 2015, Stephan Hoyer wrote:
This smells like it's related to gh-6061 (which has an unreviewed PR sitting around since July: gh-6067). Merging gh-6067 directly may not fix this because I'm not sure that |
@sdementen: Though note that there is also an independent plan to change |
I stumbled upon this bug while using xlwings (an interface to Excel). It was
On Thursday, October 22, 2015, Nathaniel J. Smith wrote:
This one has tripped me up recently - putting tuples in HDF5 attributes via h5py converts them to numpy arrays, and I put a tuple of an int and a string in and got back two strings later. Most of the casting numpy does is to types that pass equality checks, but since |
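The round-trip problem described in this comment can be sketched without h5py. The snippet below is only an illustration: it assumes the conversion behaves like a plain `np.array(...)` call on the tuple, and the variable names are made up for the example.

```python
import numpy as np

# A tuple mixing an int and a string, as in the HDF5 attribute case.
original = (1, "a")

# Converting to an array and back does not return the original values:
# the int has become a string.
round_tripped = tuple(np.array(original).tolist())
print(round_tripped)              # ('1', 'a')
print(round_tripped == original)  # False

# By contrast, an int/float mix is upcast to float, and the values
# still compare equal to the originals.
numeric = (1, 2.5)
numeric_round_tripped = tuple(np.array(numeric).tolist())
print(numeric_round_tripped == numeric)  # True
```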
I don't think that this is a bug. We can find the same behavior in R: both upcast the dtype to the minimal type required to hold the data. Object dtype seems extreme to me. For example, numpy uses it when we do not have a matrix structure, that is, when the array elements do not have the same dimension, e.g. np.array([[1,2,3],[1,2]])
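A small sketch of the upcasting this comment refers to. One assumption to flag: `dtype=object` is passed explicitly for the ragged list, because recent NumPy releases raise an error for ragged input unless it is given, whereas releases from the time of this thread fell back to object dtype on their own.

```python
import numpy as np

# int + float: upcast to the smallest common numeric type.
ints_and_floats = np.array([1, 2.5])
print(ints_and_floats.dtype)        # float64

# int + str: upcast to a unicode string dtype, not to object.
ints_and_strs = np.array([1, "a"])
print(ints_and_strs.dtype.kind)     # U

# Ragged nested lists: no rectangular shape exists, so the result is a
# 1-D object array holding two Python lists.
ragged = np.array([[1, 2, 3], [1, 2]], dtype=object)
print(ragged.shape, ragged.dtype)   # (2,) object
```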
The following code explains the issue
I would have expected that not specifying a dtype would give the same result as specifying dtype=object (to be able to hold both ints and strings),
as the documentation at http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html describes:
dtype : data-type, optional
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to ‘upcast’ the array. For downcasting, use the .astype(t) method.
But I may have missed some point in the docs
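The snippet from the original report is not preserved here. The following is a minimal reconstruction of the kind of behavior being described, not the reporter's actual code; the variable names are illustrative.

```python
import numpy as np

mixed = [1, "a"]

# Without a dtype, NumPy picks a unicode string dtype and converts the int.
inferred = np.array(mixed)
print(inferred.dtype.kind)   # U
print(inferred.tolist())     # ['1', 'a']

# With dtype=object, the original Python objects are kept as they are.
explicit = np.array(mixed, dtype=object)
print(explicit.dtype)        # object
print(explicit.tolist())     # [1, 'a']
```

The exact width of the inferred string dtype depends on the platform's default integer size, so the sketch checks only `dtype.kind`.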