[MRG] Sorting ordering option in OrdinalEncoder #14984

Closed
wants to merge 32 commits into from

Conversation

venkyyuvy
Contributor

Reference Issues/PRs

Fixes #14954

What does this implement/fix?

Added the option categories='lexicographic' and added corresponding warning
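For context, here is a minimal sketch of what lexicographic (sorted) ordering means for ordinal codes. It is written in plain Python rather than with the scikit-learn estimator, and the helper names `fit_sorted_categories` and `ordinal_encode`, as well as the sample data, are hypothetical and not part of this PR.

```python
def fit_sorted_categories(values):
    """Return the unique values in sorted order (lexicographic for strings)."""
    return sorted(set(values))


def ordinal_encode(values, categories):
    """Map each value to its position in the given category list."""
    index = {cat: i for i, cat in enumerate(categories)}
    return [index[v] for v in values]


# Made-up sample data with a natural order: low < medium < high.
sizes = ["low", "medium", "high", "low"]

# Sorted ordering puts "high" first, which may not match the natural order.
sorted_cats = fit_sorted_categories(sizes)
sorted_codes = ordinal_encode(sizes, sorted_cats)

# An explicit category list keeps the intended order.
explicit_codes = ordinal_encode(sizes, ["low", "medium", "high"])
```

With sorted ordering, "high" receives the smallest code, which is why a user might want to be warned or asked to request sorting explicitly.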

Any other comments?

Member

@jnothman jnothman left a comment


Please add a test for the warning functionality

@venkyyuvy
Contributor Author

venkyyuvy commented Sep 22, 2019

> Please add a test for the warning functionality

Added. Thanks

@jnothman
Member

Please change the title to describe the change. The PR title will become the commit message once merged.

@venkyyuvy venkyyuvy changed the title from "add_lexico" to "[MRG] adding lexicographic ordering option" Sep 23, 2019
@venkyyuvy
Contributor Author

> Please change the title to describe the change. The PR title will become the commit message once merged.

oh... Thanks. Edited.

Member

@jnothman jnothman left a comment


I wonder if categories='sort' would be better than categories='lexicographic'. I also suspect that this deserves an update to doc/modules/preprocessing.rst.

@venkyyuvy
Contributor Author

venkyyuvy commented Oct 7, 2019

> I wonder if categories='sort' would be better than categories='lexicographic'. I also suspect that this deserves an update to doc/modules/preprocessing.rst.

Apologies for the delayed response. I have incorporated both suggestions. Kindly let me know your feedback.

Unfortunately, the string 'sort' creates some issues. I am currently going back to the string 'lexicographic'.

@ogrisel
Member

ogrisel commented Oct 7, 2019

> Unfortunately, the string 'sort' creates some issues. I am currently going back to the string 'lexicographic'.

What issues? I don't see any valid reason as to why using "sort" as an option name would not work.

@venkyyuvy
Contributor Author

venkyyuvy commented Oct 8, 2019

> > Unfortunately, the string 'sort' creates some issues. I am currently going back to the string 'lexicographic'.
>
> What issues? I don't see any valid reason as to why using "sort" as an option name would not work.

Sorry, my bad. It was not working in this commit (specifically, the line if self.categories not in ['auto', 'sort']: was evaluating to true even when self.categories='sort'), and I was unable to figure out the reason.

Now, it is working fine. Thanks for your inputs.
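For reference, a small sketch showing that the membership test quoted above evaluates to False when the value is the plain string 'sort'. The helper `resolve_categories_mode` is hypothetical and is not the code from this PR; 'sort' is the option name proposed here, and the string check is guarded with `isinstance` on the assumption that `categories` may also be passed as a list of category lists.

```python
def resolve_categories_mode(categories):
    """Classify the `categories` argument (hypothetical helper, not sklearn code)."""
    if isinstance(categories, str):
        # Only the two string options discussed in this thread are accepted.
        if categories not in ("auto", "sort"):
            raise ValueError(f"Unsupported categories option: {categories!r}")
        return categories
    # Anything that is not a string is treated as user-provided categories.
    return "manual"


# The exact expression from the comment above, with a plain string value.
check_for_sort = "sort" not in ["auto", "sort"]

mode_sort = resolve_categories_mode("sort")
mode_manual = resolve_categories_mode([["low", "medium", "high"]])
```

Checking the type first keeps the string comparison from ever being applied to a list or array argument.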

@venkyyuvy venkyyuvy changed the title from "[MRG] adding lexicographic ordering option" to "[MRG] Sorting ordering option in OrdinalEncoder" Oct 8, 2019
@venkyyuvy venkyyuvy requested review from jnothman and ogrisel October 16, 2019 03:27
Member

@jnothman jnothman left a comment


I had intended to review today anyway. Thanks!

@glemaitre
Member

I think that #14984, #15050, and #15396 might not be blockers for 0.22, and I would move them to 0.23.

I think it would be great to have a single issue (superseding #14953 and #14954) to discuss the overall behaviour of categories in OneHotEncoder and OrdinalEncoder, and from there have several PRs that follow the discussed proposals.

@agramfort
Member

FYI, I recently wrote a count-based OrdinalEncoder for students:

https://gist.github.com/agramfort/4873f16d78fde33f0caa482febf08211

maybe it helps
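To illustrate the general idea of count-based ordering, here is a minimal plain-Python sketch. It is not the code from the gist above (whose contents are not reproduced in this thread); the helper names, the tie-breaking rule, and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def fit_count_categories(values):
    """Order categories by descending frequency (assumed rule for this sketch).

    Ties are broken by sorted order so the result is deterministic.
    """
    counts = Counter(values)
    return sorted(counts, key=lambda cat: (-counts[cat], cat))


def ordinal_encode(values, categories):
    """Map each value to its position in the given category list."""
    index = {cat: i for i, cat in enumerate(categories)}
    return [index[v] for v in values]


# Made-up sample data: "red" appears 3 times, "blue" 2 times, "green" once.
colors = ["red", "blue", "red", "green", "red", "blue"]
count_cats = fit_count_categories(colors)
count_codes = ordinal_encode(colors, count_cats)
```

Here the most frequent category gets code 0; a real implementation might just as well choose ascending counts.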

@jnothman jnothman modified the milestones: 0.22, 0.23 Dec 5, 2019
@jnothman
Member

jnothman commented Dec 7, 2019 via email

@venkyyuvy
Contributor Author

venkyyuvy commented Dec 8, 2019

> you will also need a what's new entry

Could you please help me with the type of entry I need to use in the what's new entry?
Is it Enhancement or Fix?

@jnothman
Member

jnothman commented Dec 8, 2019

It's an API entry: we are providing a new way to access existing functionality

@adrinjalali
Member

removing from the milestone.

@adrinjalali adrinjalali removed this from the 0.23 milestone Apr 22, 2020
Member

@jnothman jnothman left a comment


I think this is looking pretty good... but maybe it's lacking consensus on whether it's the right way forward.

Base automatically changed from master to main January 22, 2021 10:51
@cmarmo cmarmo added the Needs Decision label and removed the Waiting for Reviewer label Mar 23, 2022
@adrinjalali
Member

A fresh PR might be a better way to go here.

@adrinjalali adrinjalali closed this Mar 6, 2024
Development

Successfully merging this pull request may close these issues.

OrdinalEncoder: Deprecate automatically assuming lexicographic ordering
9 participants