[WIP] 'most_frequent' drop method for OneHotEncoder #18679

Open
wants to merge 4 commits into
base: main

trewaite
Contributor

Reference Issues/PRs

#18553

What does this implement/fix? Explain your changes.

Added a 'most_frequent' option to the drop argument of OneHotEncoder. Added a categories_count_ attribute to accomplish this. If all of a feature's categories have equal counts, the first category is dropped.
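
To illustrate the rule described above, here is a minimal sketch in plain Python. It is not the code from this PR: `most_frequent_drop` is a hypothetical helper, and it assumes that categories are ordered by sorted value and that ties fall back to the first category in that order.

```python
from collections import Counter


def most_frequent_drop(columns):
    """Hypothetical helper: pick, for each feature, the category to drop."""
    drop = []
    for values in columns:
        counts = Counter(values)
        # Assumption: categories are considered in sorted order.
        categories = sorted(counts)
        # max() returns the first maximal element, so when all counts are
        # equal the first sorted category is selected.
        drop.append(max(categories, key=lambda c: counts[c]))
    return drop


columns = [
    ["red", "blue", "red", "green"],  # "red" is the most frequent category
    ["S", "M", "L", "XL"],            # all counts equal: first sorted category
]
drop = most_frequent_drop(columns)
print(drop)  # ['red', 'L']
```

A list like this could also be passed to the existing `drop` parameter of OneHotEncoder, which already accepts an array-like with one category to drop per feature.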

Any other comments?

The issue also asked for a dropped_levels_ attribute. Let me know if this is something you would like added.

@trewaite
Contributor Author

trewaite commented Nov 4, 2020

@jnothman let me know what you think! I've never added a new attribute before and I'm not sure of the protocol for documenting one, which seems to be why some checks have failed.

@thomasjpfan
Member

For reference, there is concurrent work on getting counts in the encoders here: #16018

@jnothman
Member

Sorry for the slow attention here @trewaite. Would the changes to _encode.py proposed in #16018 be sufficient for the functionality here, or do you need additional changes to _encode.py?

I'm not yet sure where that error has come from.

@trewaite
Contributor Author

trewaite commented Jan 4, 2021

> Sorry for the slow attention here @trewaite. Would the changes to _encode.py proposed in #16018 be sufficient for the functionality here, or do you need additional changes to _encode.py?
>
> I'm not yet sure where that error has come from.

No problem! Yes, the changes in #16018 will be sufficient for implementing this functionality.

Should I wait until #16018 is merged to master and continue implementation from there? What would best practice be?

Base automatically changed from master to main January 22, 2021 10:53
@ogrisel
Member

ogrisel commented Sep 9, 2022

#16018 was merged.

@B-Noumedem

Hello, any news about this feature? It is quite useful for GLM models.

@muhlbach

muhlbach commented Dec 9, 2024

What is the status on this PR? Would very much like this feature.


6 participants