Support for ordinal multi-classification #23324

lorenzwalthert · 2022-05-10T16:10:24Z

Describe the workflow you want to enable

Encode the ordering of the response variable with any scikit-learn classifier, following the method introduced in this frequently cited paper.

Describe your proposed solution

Implement OrdinalClassifier, a classifier that takes another scikit-learn classifier as input and encodes the ordinality assumption into it. The idea is quite simple, and there are blog posts on it, for example by @m46f here. Here's a slightly more advanced version that can also handle arbitrary labels, which I'd be willing to contribute as a PR: https://gist.github.com/lorenzwalthert/51371894225c7b530b66bdabfad60327

Many classification problems are inherently ordinal, and this trick has proven to work well in various settings. The code adds very little to the project and zero dependencies, and the approach is widely cited (600+ citations).
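For readers who want the idea in code, here is a minimal sketch. It is not the code from the gist, and it assumes the method meant is the usual threshold decomposition: with K ordered classes, fit K-1 binary classifiers, where classifier k estimates P(y > class k), then take differences to get per-class probabilities. The class name `OrdinalClassifierSketch` and the `classes` parameter are made up for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.linear_model import LogisticRegression


class OrdinalClassifierSketch(ClassifierMixin, BaseEstimator):
    """Hypothetical sketch: one binary classifier per threshold P(y > class_k)."""

    def __init__(self, estimator, classes):
        self.estimator = estimator
        self.classes = classes  # labels listed from lowest to highest

    def fit(self, X, y):
        self.classes_ = np.asarray(self.classes)
        # Map arbitrary labels to their rank 0..K-1.
        rank = {label: i for i, label in enumerate(self.classes)}
        y_rank = np.array([rank[label] for label in y])
        # Model k is trained on the binary target "rank > k".
        # Assumes every threshold has samples on both sides.
        self.estimators_ = [
            clone(self.estimator).fit(X, (y_rank > k).astype(int))
            for k in range(len(self.classes_) - 1)
        ]
        return self

    def predict_proba(self, X):
        # Column k holds the estimate of P(y > k).
        greater = np.column_stack(
            [est.predict_proba(X)[:, 1] for est in self.estimators_]
        )
        n = greater.shape[0]
        upper = np.hstack([np.ones((n, 1)), greater])   # P(y > k-1)
        lower = np.hstack([greater, np.zeros((n, 1))])  # P(y > k)
        # P(y = k) = P(y > k-1) - P(y > k); entries can be negative
        # because the binary models are fitted independently.
        return upper - lower

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]


# Toy usage with string labels in a known order.
X = np.linspace(0, 3, 90).reshape(-1, 1)
y = np.array(["low"] * 30 + ["mid"] * 30 + ["high"] * 30)
clf = OrdinalClassifierSketch(LogisticRegression(), classes=["low", "mid", "high"])
clf.fit(X, y)
proba = clf.predict_proba(X)
pred = clf.predict(X)
```

Passing the class order explicitly is one possible design; it avoids guessing an order from arbitrary labels.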

Describe alternatives you've considered, if relevant

Using that code outside of scikit-learn, but I think it's generally useful to others.

Additional context

No response

glemaitre · 2022-05-10T21:02:45Z

How does it compare with NOCATS, such as the work started in this pull request: #12866?

lorenzwalthert · 2022-05-11T08:35:30Z

From my understanding of NOCATS, the differences are:

NOCATS approaches handle categorical variables in the predictors, while ordinal classification addresses the response variable.
NOCATS tries to get rid of the ordinality of features, or allows it to be unknown, while ordinal classification assumes the user knows the order. In more detail: there are two classical ways to treat categorical features. With dummy encoding, each split is one group vs. the rest (which requires a lot of nodes). With integer encoding, an order of the categories is assumed at the split point (because all observations below the cut-off are sent left and the rest are sent right). The NOCATS approach outlined in https://github.com/jblackburne uses random splits to generate the grouping at the split (and hopes to find useful ones).
NOCATS and similar efforts have a bigger scope in the sense that they seem to be relatively big changes in terms of line count, require compiled code, etc.
NOCATS implementations like NOCATS: Categorical splits for tree-based learners (ctnd.) #12866 are specific to one (base) learner, such as trees, while the learner proposed in this issue can turn any classifier into one that respects the order of the y values. In that sense, the scope of this issue is bigger than NOCATS.
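To make the two classical feature encodings concrete, here is a small sketch using scikit-learn's `OneHotEncoder` and `OrdinalEncoder`. The toy `colors` column is invented for illustration, and the example says nothing about how NOCATS itself is implemented.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# A single categorical feature with three unordered categories.
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# Dummy encoding: one 0/1 column per category (sorted: blue, green, red).
# A tree split on one column separates that category from all the rest.
one_hot = OneHotEncoder().fit_transform(colors).toarray()

# Integer encoding: blue=0, green=1, red=2.
# A threshold split implicitly assumes this order is meaningful.
integer = OrdinalEncoder().fit_transform(colors)

# Example threshold split: everything <= 0.5 goes left (only "blue").
goes_left = integer.ravel() <= 0.5
```

With the integer encoding, no single threshold can group "blue" and "red" together against "green", which is the kind of grouping a categorical split would allow.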

glemaitre · 2022-05-11T09:39:17Z

I completely missed that it was referring to encoding the target and not the data. So this is not related to NOCATS.

glemaitre · 2022-05-11T09:47:57Z

So this is something that could fit in the sklearn.multiclass module, in addition to the OvO and OvR classifiers.

MuhammadAgf · 2022-05-11T13:46:28Z

Hello
Thanks for mentioning me; I'm the author of the article mentioned above.
I think it's important to be aware that there is a major flaw in the implementation that I wrote in that article: since the classifiers are independent of each other, the method won’t give true probabilities (they won’t sum up to 1).

This article by Christopher Coffee might help deal with the problem mentioned above:
https://medium.com/@deathswithbenefits/heres-the-actual-recipe-ec9aa0915d40

lorenzwalthert · 2022-05-20T10:42:37Z

> I think it's important to be aware that there is a major flaw in the implementation that I wrote in that article: since the classifiers are independent of each other, the method won’t give true probabilities (they won’t sum up to 1).

Ok. But I don’t think that’s a big problem. We can just normalize the probability, no?

Also, here is some other related work:
https://github.com/leeprevost/OrdinalClassifier
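A rough sketch of what normalizing could look like, assuming the per-threshold estimates P(y > k) are available as a matrix with one column per threshold. Because the binary classifiers are fitted independently, these estimates need not decrease with k, so the raw differences can be negative; clipping at zero and rescaling each row is one simple option, not necessarily the best one. The function name `normalize_ordinal_proba` is made up for this example.

```python
import numpy as np


def normalize_ordinal_proba(greater):
    """Turn P(y > k) estimates, shape (n, K-1), into class probabilities, shape (n, K)."""
    greater = np.asarray(greater, dtype=float)
    n = greater.shape[0]
    upper = np.hstack([np.ones((n, 1)), greater])   # P(y > k-1), with P(y > -1) = 1
    lower = np.hstack([greater, np.zeros((n, 1))])  # P(y > k), with P(y > K-1) = 0
    raw = upper - lower                             # may contain negative entries
    clipped = np.clip(raw, 0.0, None)               # drop negative mass
    return clipped / clipped.sum(axis=1, keepdims=True)


# First row is monotone (0.9 >= 0.4); second row is not (0.3 < 0.5).
greater = np.array([[0.9, 0.4], [0.3, 0.5]])
proba = normalize_ordinal_proba(greater)
```

For the monotone row nothing changes; for the second row the negative middle entry is set to zero and the remaining mass is rescaled.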

MuhammadAgf · 2022-12-01T13:39:46Z

> > I think it's important to be aware that there is a major flaw in the implementation that I wrote in that article: since the classifiers are independent of each other, the method won’t give true probabilities (they won’t sum up to 1).
>
> Ok. But I don’t think that’s a big problem. We can just normalize the probability, no?
>
> Also, here is some other related work: https://github.com/leeprevost/OrdinalClassifier

I've checked the link that you shared. I think it has a better implementation than the one I provided in the Medium post.

dkimpara · 2023-02-03T18:10:02Z

Is there still interest in including this feature?

koaning · 2024-01-29T20:12:54Z

Figured I'd mention it: @FBruzzesi made an implementation of an OrdinalClassifier that's a meta-estimator in scikit-lego. It's a somewhat experimental feature (most of the library is that) but could also be relevant to folks reading this.

lorenzwalthert added Needs Triage Issue requires triage New Feature labels May 10, 2022

lorenzwalthert changed the title ~~Support Ordinal Classification~~ Support for ordinal classification May 10, 2022

lorenzwalthert changed the title ~~Support for ordinal classification~~ Support for ordinal multi-classification May 11, 2022

thomasjpfan added Needs Decision - Include Feature Requires decision regarding including feature and removed Needs Triage Issue requires triage labels May 11, 2022

FBruzzesi mentioned this issue Jan 10, 2024

[FEATURE] Meta Ordinal Classification koaning/scikit-lego#607

Closed
