Support for ordinal multi-classification #23324

lorenzwalthert opened this issue May 10, 2022 · 9 comments
Labels
Needs Decision - Include Feature (requires decision regarding including feature), New Feature

Comments

@lorenzwalthert

lorenzwalthert commented May 10, 2022

Describe the workflow you want to enable

Make any scikit-learn classifier respect the ordering of the response variable, using the method introduced in this frequently cited paper.

Describe your proposed solution

Implement OrdinalClassifier, a meta-classifier that takes another scikit-learn classifier as input and encodes the ordinality assumption into it. The idea is quite simple, and there are blog posts on it, for example by @m46f here. Here's a slightly more advanced version that also handles arbitrary labels, which I'd be willing to contribute as a PR: https://gist.github.com/lorenzwalthert/51371894225c7b530b66bdabfad60327

Many classification problems are inherently ordinal, and this trick has proven to work well in various settings. It takes very little code, adds zero dependencies, and comes from a much-cited paper (600+ citations).
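
For readers who want a concrete picture of the proposal, here is a minimal sketch of such a meta-estimator. It is not the code from the gist above. It assumes the linked paper describes the usual threshold decomposition (one binary model per cut point, each estimating P(y > c_k)), and the class name `OrdinalClassifierSketch` is hypothetical. It also assumes that the sorted label order is the ordinal order and that at least two classes are present.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.linear_model import LogisticRegression


class OrdinalClassifierSketch(ClassifierMixin, BaseEstimator):
    """Hypothetical meta-estimator; not part of scikit-learn."""

    def __init__(self, estimator=None):
        self.estimator = estimator

    def fit(self, X, y):
        # Default base learner is an arbitrary choice for this sketch.
        base = LogisticRegression() if self.estimator is None else self.estimator
        y = np.asarray(y)
        # Assumption: sorted label order equals the ordinal order.
        self.classes_ = np.unique(y)
        # One binary model per threshold, each estimating P(y > classes_[k]).
        self.estimators_ = [
            clone(base).fit(X, (y > c).astype(int)) for c in self.classes_[:-1]
        ]
        return self

    def predict_proba(self, X):
        # Column k holds the estimate of P(y > classes_[k]).
        gt = np.column_stack(
            [est.predict_proba(X)[:, 1] for est in self.estimators_]
        )
        n = gt.shape[0]
        upper = np.hstack([np.ones((n, 1)), gt])   # P(y > previous class), 1 for the first
        lower = np.hstack([gt, np.zeros((n, 1))])  # P(y > this class), 0 for the last
        # P(y = c_k) = P(y > c_{k-1}) - P(y > c_k); entries can be negative
        # when the independently fitted binary models disagree.
        return upper - lower

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]
```

Any classifier exposing `predict_proba` could be passed as `estimator`, for example `OrdinalClassifierSketch(estimator=some_classifier).fit(X, y)`.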

Describe alternatives you've considered, if relevant

Using that code outside of scikit-learn, but I think it's generally useful to others.

Additional context

No response

@lorenzwalthert lorenzwalthert added Needs Triage Issue requires triage New Feature labels May 10, 2022
@lorenzwalthert lorenzwalthert changed the title from "Support Ordinal Classification" to "Support for ordinal classification" May 10, 2022
@glemaitre
Member

How does this compare with NOCATS, as started in pull request #12866?

@lorenzwalthert
Author

From my understanding of NOCATS, the differences are:

  • NOCATS approaches handle categorical variables in the predictors, while ordinal classification addresses the response variable.
  • NOCATS tries to get rid of the ordinality of features, or allows it to be unknown, while ordinal classification assumes the user knows the order. In more detail: there are two classical ways to treat categorical features. With dummy encoding, each split is one group vs. the rest (which requires a lot of nodes). With integer encoding, an order of the categories is assumed at the split point (because all observations below the cut-off are sent left and the rest are sent right). The NOCATS approach outlined in https://github.com/jblackburne uses random splits to generate the grouping at the split (and hopes to find useful ones).
  • NOCATS and similar efforts have a bigger scope in the sense that they seem to be relatively big changes in terms of line count, require compiled code, etc.
  • NOCATS implementations like #12866 (NOCATS: Categorical splits for tree-based learners (ctnd.)) are specific to one base learner, such as trees, while the learner proposed in this issue can turn any classifier into one that respects the order of the y values. In that sense, the scope of this issue is bigger than NOCATS.
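
To illustrate the feature-encoding point in the list above, the following sketch enumerates which groupings a single tree split can produce under each encoding. The category names are made up for illustration, and nothing here is taken from the NOCATS pull request.

```python
# Hypothetical categorical feature with four levels.
categories = ["red", "green", "blue", "yellow"]
n = len(categories)

# Integer encoding: a threshold split "code <= t" can only separate
# contiguous groups in the (possibly arbitrary) assumed order.
codes = {c: i for i, c in enumerate(categories)}
integer_splits = [
    (
        {c for c in categories if codes[c] <= t},
        {c for c in categories if codes[c] > t},
    )
    for t in range(n - 1)
]

# Dummy (one-hot) encoding: each split isolates one category vs. the rest.
dummy_splits = [({c}, set(categories) - {c}) for c in categories]

# Number of distinct ways to divide n categories into two non-empty groups.
all_partitions = 2 ** (n - 1) - 1

print(len(integer_splits), len(dummy_splits), all_partitions)
```

Neither encoding reaches every possible grouping in one split, which is the gap that searching over category groupings at the split tries to close. None of this touches the response variable, which is what this issue is about.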

@glemaitre
Member

I completely missed that it was referring to encoding the target and not the data. So this is not related to NOCATS.

@glemaitre
Member

So this could fit in the sklearn.multiclass module, alongside the OvO and OvR classifiers.

@lorenzwalthert lorenzwalthert changed the title from "Support for ordinal classification" to "Support for ordinal multi-classification" May 11, 2022
@MuhammadAgf

Hello,
Thanks for mentioning me. I'm the author of the article mentioned above.
I think it's important to be aware that there is a major flaw in the implementation that I wrote in that article: since the classifiers are independent of each other, the method won't give true probabilities (they won't sum to 1).

This article by Christopher Coffee might help deal with the problem mentioned above:
https://medium.com/@deathswithbenefits/heres-the-actual-recipe-ec9aa0915d40

@thomasjpfan thomasjpfan added Needs Decision - Include Feature Requires decision regarding including feature and removed Needs Triage Issue requires triage labels May 11, 2022
@lorenzwalthert
Author

lorenzwalthert commented May 20, 2022

I think it's important to be aware that there is a major flaw in the implementation that I wrote in that article: since the classifiers are independent of each other, the method won't give true probabilities (they won't sum to 1).

Ok. But I don’t think that’s a big problem. We can just normalize the probability, no?

Also, here is some other related work:
https://github.com/leeprevost/OrdinalClassifier
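
The normalization suggested in this comment could look roughly like the sketch below. It is one possible reading of "normalize" (clip negative scores to zero, then rescale each row to sum to 1), not code from the linked repository or from any of the articles. The function name `normalize_rows` and the input values are hypothetical.

```python
import numpy as np


def normalize_rows(scores, eps=1e-12):
    """Clip negative scores to zero, then rescale each row to sum to 1."""
    clipped = np.clip(np.asarray(scores, dtype=float), 0.0, None)
    totals = clipped.sum(axis=1, keepdims=True)
    # Fall back to a uniform row if every score in it was clipped to zero.
    uniform = np.full_like(clipped, 1.0 / clipped.shape[1])
    return np.where(totals > eps, clipped / np.maximum(totals, eps), uniform)


# Made-up per-class scores, as independently trained binary models might
# produce: the first row contains a negative entry.
raw = np.array([[0.7, -0.1, 0.4],
                [0.2, 0.3, 0.5]])
proba = normalize_rows(raw)
```

Rescaling makes the rows valid probability vectors, but it does not by itself make them well calibrated.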

@MuhammadAgf

I think it's important to be aware that there is a major flaw in the implementation that I wrote in that article: since the classifiers are independent of each other, the method won't give true probabilities (they won't sum to 1).

Ok. But I don’t think that’s a big problem. We can just normalize the probability, no?

Also, here is some other related work: https://github.com/leeprevost/OrdinalClassifier

I've checked the link that you shared. I think it has a better implementation than the one I provided in the Medium post.

@dkimpara

dkimpara commented Feb 3, 2023

Is there still interest in including this feature?

@koaning

koaning commented Jan 29, 2024

Figured I'd mention it: @FBruzzesi made an implementation of an OrdinalClassifier, a meta-estimator, in scikit-lego. It's a somewhat experimental feature (most of the library is), but it could also be relevant to folks reading this.
