Code Sample, a copy-pastable example if possible
import pandas as pd

mix = pd.MultiIndex.from_product([['foo', 'bar'], ['one', 'two'], ['A', 'B']])
mix.isin([('foo', 'one', 'B'), ('bar', 'two', 'A')]) # returns `[False, True, False, False, False, False, True, False]`
mix.isin([('bar',), ('foo', 'one', 'B'), ('bar',)]) # should raise, returns `[False, True, False, False, False, False, False, False]` instead
Problem description
I got bit this morning by some improperly structured arguments for `MultiIndex.isin`. It turns out that as long as one value in the input `values` is the right length, `isin` doesn't care if every other value is too short. On the other hand, `isin` will error out in many other cases relating to the lengths of elements of the input. With respect to my above example, these all raise an error:
mix.isin([('foo', 'one'), ('bar', 'two')]) # raises `ValueError: Length of names must match number of levels in MultiIndex.`
mix.isin([('foo', 'one', 'B'), ('bar', 'two', 'A', 'alpha')]) # raises `ValueError: Length of names ...`
mix.isin([('bar',), ('foo', 'one', 'B'), ('bar', 'two', 'A', 'alpha')]) # raises `ValueError: Length of names ...`
but this hot mess executes normally:
mix.isin([('foo', 'one', 'B'), *(('bar',),)*33]) # returns [False, True, False, False, False, False, False, False]
I'd like to see two fixes:
- If elements of `values` are too short, `isin` should raise an error, just like it does in cases when elements are too long.
- The current error message for invalid value length is a little confusing, since it refers to `Length of names`, even though `names` isn't an argument for `isin`. It turns out the reason for this is that the error is raised by the function `_set_names` during the construction of a new `MultiIndex` that `isin` uses to do validation.
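To make the two requests concrete, here is a rough sketch of the kind of up-front check I have in mind. The helper name `check_value_lengths` and the wording of its error message are hypothetical and not part of pandas; it is only meant to illustrate the behavior (reject too-short and too-long elements alike, and say so in terms of `values` rather than `names`), not how the actual fix would be implemented.

```python
def check_value_lengths(values, nlevels):
    """Raise ValueError if any element of `values` does not have `nlevels` entries.

    Hypothetical helper for illustration only; not a pandas function.
    """
    values = list(values)
    for position, value in enumerate(values):
        if len(value) != nlevels:
            raise ValueError(
                f"Element {position} of values has length {len(value)}, "
                f"but the MultiIndex has {nlevels} levels: {value!r}"
            )
    return values


# Every tuple has three entries, so this passes for a three-level index.
ok = check_value_lengths([("foo", "one", "B"), ("bar", "two", "A")], nlevels=3)

# A too-short tuple is rejected up front instead of being silently ignored.
try:
    check_value_lengths([("bar",), ("foo", "one", "B"), ("bar",)], nlevels=3)
except ValueError as err:
    message = str(err)
```

As a stopgap, a caller could run this before the lookup, e.g. `mix.isin(check_value_lengths(values, mix.nlevels))`, so that malformed input fails loudly regardless of whether one element happens to have the right length.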
I'll submit a PR