BUG: Handle subarrays in descr_to_dtype #13433

Merged
merged 3 commits into from
May 12, 2019

Conversation

mattip
Member

@mattip mattip commented Apr 30, 2019

Fixes #13431.

There are alternative spellings of dtype=[('c', '<f8', (2, 5))]; this PR handles the dtype=[('c', ('<f8', (5,)), (2,))] variant.
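To make the two spellings concrete, here is a minimal sketch (the variable names `flat` and `nested` are illustrative and not part of the PR). It only checks that both forms describe the same number of bytes per record and prints the shape seen when the field is indexed:

```python
import numpy as np

# Two spellings of a field 'c' that holds 2 x 5 float64 values.
flat = np.dtype([('c', '<f8', (2, 5))])
nested = np.dtype([('c', ('<f8', (5,)), (2,))])

# Both describe 10 doubles (80 bytes) per record.
print(flat.itemsize, nested.itemsize)

# Indexing the field exposes the subarray dimensions on the array itself.
print(np.zeros(3, flat)['c'].shape)
print(np.zeros(3, nested)['c'].shape)
```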

@seberg seberg added this to the 1.16.4 release milestone Apr 30, 2019
@seberg seberg self-assigned this Apr 30, 2019
@charris
Member

charris commented Apr 30, 2019

The test failure was the matmul heisenbug. Restarted test.

Member

@seberg seberg left a comment

Strange beast, nested subfields... But they do seem to work fine for all things (including being resolved at arbitrary depth when there are no fields left).

Anyway, LGTM, will merge soon.


    This function reverses the process, eliminating the empty padding fields.
    '''
-    if isinstance(descr, (str, dict)):
+    if isinstance(descr, (str, dict, tuple)):
         # No padding removal needed
Member

What happens if this is a subarray of structured types?

Member

s = np.dtype([('a', np.int8), ('b', np.int16), ('c', np.int32)], align=True)
s_sub = np.dtype((s, (3,)))
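As a hedged illustration of why this case matters, the sketch below inspects the two dtypes from the snippet above. With `align=True` the struct carries a padding byte, which shows up as an unnamed void entry in `.descr`, and the subarray dtype wraps that struct as its base:

```python
import numpy as np

s = np.dtype([('a', np.int8), ('b', np.int16), ('c', np.int32)], align=True)
s_sub = np.dtype((s, (3,)))

# align=True places 'b' at offset 2, leaving one padding byte after 'a';
# .descr reports that gap as an unnamed void entry.
print(s.descr)

# .subdtype returns (base dtype, shape) for a subarray dtype.
base, shape = s_sub.subdtype
print(base.names, shape, s_sub.itemsize)
```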

Member

I think you need to recurse for subarray types

Member

That is an interesting case. Top-level subarrays are dissolved on arrays (their shape is appended to the dimensions of the array). I cannot quickly find a way to create an array with such a dtype, but it somewhat feels like there may have been strange ways to do it.

Member

@seberg seberg May 1, 2019

s = np.dtype([('a', np.int8), ('b', np.int16), ('c', np.int32)], align=True)
s_sub = np.dtype((s, (1,1)))
arr = np.zeros(3, s_sub)
print(arr.shape, arr.dtype)
arr = np.ndarray(shape=3, buffer=arr, dtype=s_sub)
print(arr.shape, arr.dtype)

Member

Also, watch out for structured types like (int, [('fields', int)]) which have a non-void base
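For readers unfamiliar with this form, a small sketch of what such a dtype looks like. It assumes the installed NumPy still accepts the `(base_dtype, new_dtype)` constructor; the name `dt` is illustrative:

```python
import numpy as np

# (base_dtype, new_dtype): an integer dtype that also exposes named fields.
dt = np.dtype((int, [('fields', int)]))

# The dtype has field names, yet its kind is taken from the base type
# rather than being a plain void struct.
print(dt.kind, dt.names, dt.itemsize)
```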

Member

Yeah, this one is still broken (although maybe the original issue is solved and this is just another issue). I probably took too shallow a look at this, though :/.

Member

No need for the subarray to be at the top level to hit this code-path - nest it inside a structured one.

@seberg seberg self-requested a review May 1, 2019 16:54
Member

@eric-wieser eric-wieser left a comment

Trying to unstick my pending comment...

@eric-wieser
Member

eric-wieser commented May 2, 2019

Here's the case that this handles incorrectly:

dt = np.dtype([
    ('x', np.dtype((
        np.dtype((
            np.dtype({'names':['a','b'], 'formats':['i1','i1'], 'offsets':[0,4], 'itemsize':8}),
            (3,)
        )),
        (4,)
    )))
])
assert descr_to_dtype(dt.descr) == dt

@mattip
Member Author

mattip commented May 2, 2019

I think we should fail to create a dtype with (int, [('fields', int)]). It does not seem to fit into any of the categories of dtypes we should parse.

In any case, its descr attribute does not provide the information to reconstruct it, so lib.format.dtype_to_descr fails. If needed, let's handle that in a different issue/PR

@eric-wieser
Member

eric-wieser commented May 2, 2019

Agreed that the non-void struct is not important. We should still support arbitrarily nested subarrays though.

@mattip
Member Author

mattip commented May 2, 2019

I think the last commit fixed parsing nested subarrays, at least the tests with the new dtypes pass.

np.dtype([('x', ([('a', '|i1'),
('', '|V3'),
('b', '|i1'),
('', '|V3'),
Member

Why are you inserting these empty fields? The point of my example was that your code fails when there is unnamed padding here (fails by creating new fields, which this function's purpose is to avoid)
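A minimal sketch of the failure mode described here, using illustrative names `padded` and `naive`: passing a padded struct's `.descr` straight back to `np.dtype` turns each unnamed padding entry into a real, auto-named field, so the result no longer equals the original dtype.

```python
import numpy as np

padded = np.dtype({'names': ['a', 'b'], 'formats': ['i1', 'i1'],
                   'offsets': [0, 4], 'itemsize': 8})

# Feeding .descr straight back to np.dtype keeps the padding entries
# as extra fields instead of dropping them.
naive = np.dtype(padded.descr)

print(padded.names)
print(naive.names)
print(naive == padded)
```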

Member

To hit the failing path, you need to use np.dtype({'names':['a','b'], 'formats':['i1','i1'], 'offsets':[0,4], 'itemsize':8}) as the inner type here, not [('a', '|i1'), ('', '|V3'), ('b', '|i1'), ('', '|V3')]

Member Author

that passes too

Member

The full type I use in a comment above still fails

Member Author

Could you provide a complete example of a dtype that fails to roundtrip?

Member

yes

Member Author

test added and fixed

'offsets':[0,4],
'itemsize':8,
},
(3,)),
Member

I don't think dtype(dict, tuple) is legal, which will cause an error during test collection

Member Author

tests are passing

Member

>>> np.dtype(int, "this argument is ignored")
dtype('int32')

This test is ignoring the (3,) silently, which is a different bug.

@mattip
Member Author

mattip commented May 5, 2019

I don't think the dict path can ever be hit

It seems not, removing

@charris charris changed the title BUG: handle subarrays in descr_to_dtype BUG: Handle subarrays in descr_to_dtype May 11, 2019
@seberg
Member

seberg commented May 11, 2019

This stuff still confuses me a bit, but it does seem the test should cover the interesting corner cases, so can probably merge.

np.dtype((
np.dtype((
np.dtype([
('a', int)
Member

Suggested change
- ('a', int)
+ ('a', int),
+ ('b', np.dtype({'names':['a','b'],
+                 'formats':['i1','i1'],
+                 'offsets':[0,4],
+                 'itemsize':8})),

Finally, this is what will make things fail...

# subtype, will always have a shape descr[1]
dt = descr_to_dtype(descr[0])
return numpy.dtype((dt, descr[1]))
return numpy.dtype(descr)
Member

@seberg seberg May 11, 2019

Suggested change
- return numpy.dtype(descr)
+ return np.dtype(descr_to_dtype(descr[0]), descr[1])

Is that assert correct here? Since there is no list around it, there cannot be a field name, so it must have two entries, right? (We should probably not leave the assert in, or does it get stripped on install?)

Member

Almost looks correct, but this should be np.dtype((descr_to_dtype(descr[0]), descr[1]))
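The recursion discussed in this thread can be sketched roughly as follows. This is a hypothetical re-implementation for illustration only (the name `descr_to_dtype_sketch` is made up, and it ignores field titles); it is not the code merged in this PR. It uses the `np.dtype((base, shape))` tuple form from the correction above and round-trips the nested dtype posted earlier in the conversation:

```python
import numpy as np

def descr_to_dtype_sketch(descr):
    """Illustrative only: rebuild a dtype from .descr, dropping padding."""
    if isinstance(descr, str):
        return np.dtype(descr)
    if isinstance(descr, tuple):
        # (base_descr, shape): recurse into the base, then wrap as a subarray.
        return np.dtype((descr_to_dtype_sketch(descr[0]), descr[1]))

    names, formats, offsets = [], [], []
    offset = 0
    for field in descr:
        if len(field) == 2:
            name, sub = field
            dt = descr_to_dtype_sketch(sub)
        else:
            name, sub, shape = field
            dt = np.dtype((descr_to_dtype_sketch(sub), shape))
        # Unnamed plain-void entries are padding: skip them, keep counting bytes.
        is_pad = name == '' and dt.type is np.void and dt.names is None
        if not is_pad:
            names.append(name)
            formats.append(dt)
            offsets.append(offset)
        offset += dt.itemsize
    return np.dtype({'names': names, 'formats': formats,
                     'offsets': offsets, 'itemsize': offset})

inner = np.dtype({'names': ['a', 'b'], 'formats': ['i1', 'i1'],
                  'offsets': [0, 4], 'itemsize': 8})
dt = np.dtype([('x', np.dtype((np.dtype((inner, (3,))), (4,))))])
print(descr_to_dtype_sketch(dt.descr) == dt)
```

Passing the base and shape as a single tuple matters: given as two separate arguments, the shape would land in the `align` parameter instead of defining a subarray.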

@charris
Member

charris commented May 12, 2019

close/reopen

@charris charris closed this May 12, 2019
@charris charris reopened this May 12, 2019
@seberg
Member

seberg commented May 12, 2019

Ok, putting this in then. What I am not quite sure about is whether there is some issue that should be opened here. I may come back to it, but it will be a fringe issue in any case, I suppose.

@seberg seberg merged commit e6227a0 into numpy:master May 12, 2019
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label May 14, 2019
@charris charris removed this from the 1.16.4 release milestone May 14, 2019
@mattip mattip deleted the issue13431 branch June 8, 2020 06:58
Successfully merging this pull request may close these issues.

np.load() "invalid shape in fixed-type tuple" in NumPy 1.16.0
4 participants