Description
I recently updated from numpy 1.11.0 to 1.14.0, and some of my code immediately broke.
You can reproduce the breakage with the following:
import numpy as np

my_metadata_dtype = np.dtype({'names': ['Thing1', 'Thing2'],
                              'offsets': [22, 40],
                              'formats': ['>u2', '>i4'],
                              'itemsize': 256})
list_of_arrays = [np.arange(i, i+128, dtype='>i2') for i in range(5)]
array_of_metadata = np.array(list_of_arrays, dtype=my_metadata_dtype)
other_array_of_metadata = np.array(list_of_arrays)
other_array_of_metadata.dtype = my_metadata_dtype
One would expect array_of_metadata and other_array_of_metadata to be the same thing.
They are not. I submit that the line creating array_of_metadata is buggy. I believe that in previous versions it produced the same result as the lines that produce other_array_of_metadata.
But that isn't even the primary bug I wanted to report. I discovered it while hunting down the one described next.
Take our other_array_of_metadata and make a new array from a subset of it (a copy, not just a view):
sub_array = np.array(other_array_of_metadata[2:4])
sub_array.dtype = '>i2'
You will see that sub_array now holds arbitrary values in the bytes that were not exposed by our dtype, instead of the aranges we constructed before. In 1.11, this code preserved the data that was not exposed. Now it appears that the unexposed portions of sub_array are never initialized to anything. This matters because sometimes we want to expose only part of our metadata to our code, but preserve all of it when we write the metadata back out to disk.
I can understand making a design choice to not initialize the unexposed portions of a dtype when constructing new elements from literals or from scratch. But when you already have an element of this dtype, ignoring the empty space seems like an inferior choice.