Pipeline can not be used with multiclass (or the other way around) #1019

@amueller

tldr: Is it good to check if a classifier supports decision_function using hasattr?

The plan: use the following in the multiclass module to get a decision function:

try:
    scores = clf.predict_proba(X)
except (AttributeError, NotImplementedError):
    # no probability estimates available, fall back to raw decision values
    scores = clf.decision_function(X)

Optionally, attach decision_function and predict_proba to the pipeline on the fly, as GridSearchCV does.

Hey everybody.
I just noticed that Pipeline cannot be used with the multiclass module in some cases.
The multiclass module checks whether a classifier has a decision_function and tries to use that; otherwise it uses predict_proba. A Pipeline by default has the whole zoo of methods and tries to call them on the last estimator.
So if the last estimator has no decision_function, but a predict_proba (like the trees), the hasattr check on the pipeline still passes, the multiclass module calls decision_function, and the call fails on the last estimator :-/
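To make the failure concrete, here is a minimal sketch that does not use scikit-learn itself. `TreeLike`, `EagerPipeline` and `scores_via_hasattr` are hypothetical stand-ins made up for illustration; they only mimic the behaviour described above (a pipeline that always defines both methods, and a hasattr-based dispatch).

```python
class TreeLike:
    """Hypothetical classifier with predict_proba but no decision_function."""

    def predict_proba(self, X):
        # Uniform two-class probabilities, just to have something to return.
        return [[0.5, 0.5] for _ in X]


class EagerPipeline:
    """Hypothetical pipeline that always defines both scoring methods."""

    def __init__(self, final_estimator):
        self.final_estimator = final_estimator

    def decision_function(self, X):
        # Blind delegation: raises AttributeError if the last step lacks it.
        return self.final_estimator.decision_function(X)

    def predict_proba(self, X):
        return self.final_estimator.predict_proba(X)


def scores_via_hasattr(clf, X):
    """hasattr-based dispatch, as described for the multiclass module."""
    if hasattr(clf, "decision_function"):
        return clf.decision_function(X)
    return clf.predict_proba(X)


X = [[0.0], [1.0]]
pipe = EagerPipeline(TreeLike())

# The bare estimator dispatches correctly to predict_proba.
bare_scores = scores_via_hasattr(TreeLike(), X)

# The pipeline advertises decision_function, so the check passes ...
pipeline_claims_support = hasattr(pipe, "decision_function")

# ... but the delegated call fails on the last step.
try:
    scores_via_hasattr(pipe, X)
    pipeline_failed = False
except AttributeError:
    pipeline_failed = True
```

The same estimator works on its own and breaks once wrapped, which is the whole problem.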

The main reason this annoys me is that I wanted to use the same decision_function / predict_proba pattern for AUC cross-validation (see #1014), so this also fails with pipelines.

Two ways to fix this:

  1. Set the methods of Pipeline on the fly, and only if the last estimator has them. It is done this way in GridSearchCV.

  2. Instead of using hasattr(clf, 'decision_function'), we could use a try ... except AttributeError block.
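Both options can be sketched roughly as follows. This is an illustration under assumptions, not the actual Pipeline or GridSearchCV code: `ProbaOnly`, `DelegatingPipeline`, `EagerPipeline` and `get_scores` are hypothetical names, and `__getattr__` is just one possible way to make methods appear only when the last step has them. The fallback tries decision_function first, matching the order the multiclass module uses.

```python
class ProbaOnly:
    """Hypothetical classifier with predict_proba only."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class DelegatingPipeline:
    """Option 1 (sketch): expose a method only if the last step has it."""

    _delegated = ("decision_function", "predict_proba")

    def __init__(self, final_estimator):
        self.final_estimator = final_estimator

    def __getattr__(self, name):
        # Only called when normal lookup fails, so hasattr() on the pipeline
        # mirrors hasattr() on the last step for the delegated names.
        if name in self._delegated:
            return getattr(self.final_estimator, name)
        raise AttributeError(name)


class EagerPipeline:
    """Hypothetical pipeline that always defines both methods."""

    def __init__(self, final_estimator):
        self.final_estimator = final_estimator

    def decision_function(self, X):
        return self.final_estimator.decision_function(X)

    def predict_proba(self, X):
        return self.final_estimator.predict_proba(X)


def get_scores(clf, X):
    """Option 2 (sketch): try decision_function, fall back on AttributeError."""
    try:
        return clf.decision_function(X)
    except AttributeError:
        return clf.predict_proba(X)


X = [[0.0], [1.0]]

# Option 1: the hasattr check now gives the right answer on the pipeline.
lazy = DelegatingPipeline(ProbaOnly())
lazy_has_decision_function = hasattr(lazy, "decision_function")
lazy_has_predict_proba = hasattr(lazy, "predict_proba")

# Option 2: the caller copes even with a pipeline that defines everything.
eager_scores = get_scores(EagerPipeline(ProbaOnly()), X)
```

Option 1 fixes the pipeline so that existing hasattr checks keep working; option 2 fixes the caller and leaves the pipeline alone.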

From a pretty-coding standpoint I'd prefer 2). But the AttributeError might be caused by something else and shadow some other bug. If there was a NoDecisionFunctionError, I'd go for that. But we don't want custom exceptions, do we?
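The shadowing concern can be illustrated with a small sketch. `BuggyClassifier` and both helper functions are hypothetical; the "bug" is simply an attribute that was never set, which also raises AttributeError.

```python
class BuggyClassifier:
    """Hypothetical classifier whose decision_function contains a bug."""

    def decision_function(self, X):
        # Bug: coef_ was never set, so this raises AttributeError.
        return [self.coef_ for _ in X]

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def get_scores_broad(clf, X):
    """try/except around the whole call: any AttributeError triggers fallback."""
    try:
        return clf.decision_function(X)
    except AttributeError:
        return clf.predict_proba(X)


def get_scores_narrow(clf, X):
    """Guard only the attribute lookup, so errors inside the call surface."""
    try:
        method = clf.decision_function
    except AttributeError:
        method = clf.predict_proba
    return method(X)


X = [[0.0], [1.0]]

# The broad version silently falls back and hides the missing coef_.
shadowed_scores = get_scores_broad(BuggyClassifier(), X)

# The narrow version lets the real error propagate.
try:
    get_scores_narrow(BuggyClassifier(), X)
    bug_surfaced = False
except AttributeError:
    bug_surfaced = True
```

Note that the narrow variant would not help with a pipeline that defines decision_function unconditionally: there the lookup succeeds and only the call fails, so the two kinds of AttributeError cannot be told apart without a dedicated exception type.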

I'm glad I didn't start counting the API annoyances that I found this weekend :-/

Labels: Easy (well-defined and straightforward way to resolve), Enhancement
