[WIP] Handle NaNs in OneHotEncoder #16749
…into handlemissing-onehotencoder
…ing of 'indicator', 'all-zero'
How is this going, @nilichen?
I managed to get fit_transform and inverse_transform working for general cases, but got stuck deciding how … My personal view is that this function has become a bit too complicated, and it might be more straightforward for users to deal with NaN/None in pandas as usual. Otherwise, #13028 might be a better option to move forward.
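For illustration, here is a minimal sketch of what dealing with NaN/None in pandas before encoding could look like. The column name and the "missing" placeholder category are assumptions made up for this example, and pd.get_dummies stands in for the encoder; this is not code from the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column containing both None and NaN.
df = pd.DataFrame({"color": ["red", None, "blue", np.nan]})

# Replace every missing entry with an explicit placeholder category
# (the label "missing" is an arbitrary choice for this sketch).
filled = df["color"].fillna("missing")

# One-hot encode; the placeholder becomes an ordinary category column.
encoded = pd.get_dummies(filled)
```

With this approach the missing values end up in their own column, so the encoder itself never has to see NaN or None.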
Thanks for sharing your experience with it, @nilichen. Long term maintainability of these options is certainly a concern, particularly if we want to add other encoders (that support them) in the future.
Reference Issues/PRs
Towards #11996. Fixed #12025. See also #13028 and #15009.
What does this implement/fix? Explain your changes.
Tackle handle_missing specifically for NaNs.
handle_missing can be …
Tests implemented for all three options, including inverse_transform
Pending: documentation
Suggestion: can utilize pd.isna to tackle both NaNs and None
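A small sketch of the pd.isna suggestion, assuming an object-dtype array of categories such as the encoder might receive. The sample values are made up for illustration; this is not the PR's implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical categorical input mixing strings, NaN and None.
values = np.array(["a", np.nan, None, "b"], dtype=object)

# pd.isna flags both NaN and None in a single element-wise pass.
mask = pd.isna(values)

# Categories that remain once missing entries are masked out.
present = values[~mask]
```

Here mask marks the second and third entries as missing, so both kinds of missing value can be routed through the same handle_missing logic.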