
pd.MultiIndex.isin doesn't validate input correctly #26622

Open
@telamonian

Description

Code Sample, a copy-pastable example if possible

mix = pd.MultiIndex.from_product([['foo', 'bar'], ['one', 'two'], ['A', 'B']])

mix.isin([('foo', 'one', 'B'), ('bar', 'two', 'A')]) # returns `[False, True, False, False, False, False, True, False]`

mix.isin([('bar',), ('foo', 'one', 'B'), ('bar',)]) # should raise, returns [False, True, False, False, False, False, False, False] instead

Problem description

I got bitten this morning by some improperly structured arguments to MultiIndex.isin. It turns out that as long as one element of the input values is the right length, isin doesn't care if every other element is too short. On the other hand, isin does raise in many other cases where elements of the input have the wrong length. With respect to my example above, these all raise an error:

mix.isin([('foo', 'one'), ('bar', 'two')]) # raises `ValueError: Length of names must match number of levels in MultiIndex.`

mix.isin([('foo', 'one', 'B'), ('bar', 'two', 'A', 'alpha')]) # raises `ValueError: Length of names ...`

mix.isin([('bar',), ('foo', 'one', 'B'), ('bar', 'two', 'A', 'alpha')]) # raises `ValueError: Length of names ...`

but this hot mess executes normally:

mix.isin([('foo', 'one', 'B'), *(('bar',),)*33]) # returns [False,  True, False, False, False, False, False, False]

I'd like to see two fixes:

  • If elements of values are too short, isin should raise an error, just like it does in cases when elements are too long.

  • The current error message for an invalid value length is a little confusing, since it refers to "Length of names" even though names isn't an argument of isin. The reason is that the error is raised by the function _set_names while isin constructs a new MultiIndex from values in order to validate them.
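
To make the two requests concrete, here is a minimal sketch of the kind of check I have in mind. The helper name check_isin_values and the wording of its message are hypothetical, not existing pandas API; it only assumes that the number of levels (mix.nlevels, which is 3 for the example above) is known up front.

```python
def check_isin_values(values, nlevels):
    """Hypothetical helper, not part of pandas.

    Require every element of `values` to be a tuple whose length equals
    `nlevels`, and name the offending element in the error message.
    """
    values = list(values)
    for pos, value in enumerate(values):
        if not isinstance(value, tuple) or len(value) != nlevels:
            raise ValueError(
                f"values[{pos}]={value!r} does not have length {nlevels}, "
                "the number of levels in the MultiIndex"
            )
    return values


NLEVELS = 3  # assumption: same number of levels as `mix` above

# Well-formed input passes through unchanged.
ok = check_isin_values([('foo', 'one', 'B'), ('bar', 'two', 'A')], NLEVELS)

# Too-short elements are rejected, even when another element is the right length.
try:
    check_isin_values([('bar',), ('foo', 'one', 'B'), ('bar',)], NLEVELS)
except ValueError as err:
    short_message = str(err)

# Too-long elements are rejected by the same check, with the same kind of message.
try:
    check_isin_values([('foo', 'one', 'B'), ('bar', 'two', 'A', 'alpha')], NLEVELS)
except ValueError as err:
    long_message = str(err)
```

With something like this, a caller could write mix.isin(check_isin_values(values, mix.nlevels)) as a stopgap. The point of the sketch is only that short and long elements go through one check and the message talks about values rather than names.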

I'll submit a PR
