ENH: Add the classification models threshold as parameter of __init__ method #20635

Closed
@defoishugo

Describe the workflow you want to enable

Let's take a probabilistic classification model like sklearn.linear_model.LogisticRegression.
We have two methods to get the predictions:

  • predict_proba() outputs a probability for each class
  • predict() outputs a class (the one with the highest probability)
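
To make the relationship between the two methods concrete, here is a minimal sketch. The synthetic dataset from make_classification and the variable names are assumptions chosen for illustration; they are not part of the original request.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data, used only for illustration.
X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

proba = clf.predict_proba(X)  # shape (n_samples, n_classes), each row sums to 1
labels = clf.predict(X)       # shape (n_samples,), one class label per sample

# Choosing the most probable class by hand should match predict().
labels_from_proba = clf.classes_[np.argmax(proba, axis=1)]
print(np.array_equal(labels, labels_from_proba))
```

In the binary case, picking the most probable class amounts to an implicit threshold of 0.5 on the probability of the positive class.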

I would like to let the user choose the decision threshold used by predict(). This is of course not the default behavior of LogisticRegression, but it may be useful for anomaly/opportunity detection.

Real life example:

A user trains a LogisticRegression model to detect trading opportunities that are very likely to be profitable.
There are two possible classes: 1 for "opportunity", 0 for "no opportunity".
The user wants to open a position on the market only when the predicted probability of class 1 is above 90%.
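
As a rough illustration of this scenario, the sketch below applies a 90% threshold to a hand-written probability array that stands in for the output of predict_proba(). The numbers and variable names are made up for the example.

```python
import numpy as np

# Hypothetical predict_proba() output; columns are [P(class 0), P(class 1)].
proba = np.array([
    [0.05, 0.95],
    [0.40, 0.60],
    [0.80, 0.20],
    [0.08, 0.92],
])

threshold = 0.9  # only act on very confident "opportunity" predictions

# 1 where P(class 1) exceeds the threshold, 0 otherwise.
opportunity = (proba[:, 1] > threshold).astype(int)

# What the default "most probable class" rule would give instead.
default_labels = np.argmax(proba, axis=1)

print(opportunity)     # [1 0 0 1]
print(default_labels)  # [1 1 0 1]
```

The second sample is labeled 1 by the default rule but is rejected by the stricter threshold, which is the behavior the user is after.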

Describe your proposed solution

My proposed solution is to add an optional parameter threshold (default: 1/NB_CLASSES) to the predict() method.
We could also update other probabilistic classification models if needed.
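
One possible shape for this, sketched as a user-side wrapper rather than a change to scikit-learn itself. The class name ThresholdedBinaryClassifier is hypothetical, the threshold is taken in the constructor (as the issue title suggests) instead of in predict(), and the sketch covers only the binary case, where the default 1/NB_CLASSES is 0.5.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


class ThresholdedBinaryClassifier:
    """Hypothetical wrapper (not a scikit-learn class) for binary problems."""

    def __init__(self, estimator, threshold=0.5):
        self.estimator = estimator
        self.threshold = threshold

    def fit(self, X, y):
        self.estimator.fit(X, y)
        if len(self.estimator.classes_) != 2:
            raise ValueError("this sketch only supports binary problems")
        return self

    def predict(self, X):
        # Probability of the second class listed in classes_.
        proba_pos = self.estimator.predict_proba(X)[:, 1]
        classes = self.estimator.classes_
        return np.where(proba_pos > self.threshold, classes[1], classes[0])


# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, random_state=0)

default = ThresholdedBinaryClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
strict = ThresholdedBinaryClassifier(
    LogisticRegression(max_iter=1000), threshold=0.9
).fit(X, y)

# A stricter threshold can only reduce the number of positive predictions.
print(default.predict(X).sum(), strict.predict(X).sum())
```

How a single threshold should generalize to more than two classes is left open here, since the request does not spell it out.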

Additional context

I would like to work on this feature if possible.

I know this kind of thing can be done as a post-processing step on the user side.
However, since we provide a predict() method that chooses a class, I think we should also give the user a way to define the threshold.
