[MRG] Support for strings in OneHotEncoder #8793
Closed
stephen-hoover wants to merge 36 commits into scikit-learn:master from stephen-hoover:ohe-with-strings
ea98484
Refactored OneHotEncoder to work with strings
vighneshbirodkar e03b5c7
ported functions to fixes.py
vighneshbirodkar 06e6d3a
unique arrays are now sorted
vighneshbirodkar 074f194
revert selection logic
vighneshbirodkar 083142e
Added copy argument
vighneshbirodkar f768f3b
Inbetween adding the seen option
vighneshbirodkar 1e34cae
remove seen argument and support range case with FutureWarning
vighneshbirodkar fed7959
Made label_encoders_ private
vighneshbirodkar c62d2ba
Added new attributes and tests for OHE
vighneshbirodkar e929f23
Fixed doctests
vighneshbirodkar bc7a26b
Fixed rst doc tests
vighneshbirodkar feaf014
Replaced type in array with ellipsis
vighneshbirodkar 7b608e1
flake fixes
vighneshbirodkar 5f305d8
Add NORMALIZE_WHITESPACE for python3 tests
vighneshbirodkar 1392292
normalize whitespace for rst docs
vighneshbirodkar 50d2360
normalizing whitespace again
vighneshbirodkar 8f2f1d3
docstring changes and minor optimizations
vighneshbirodkar 1c8accf
Made tests pass by creating arrays with object dtype
vighneshbirodkar 6edda8b
Assign both values and n_values to self._values and remove redundant …
vighneshbirodkar 1d2ca1a
removed extra spaces for flake8 compat
vighneshbirodkar 93ae49e
REF Refactor OHE and avoid copies
fd11366
WIP
b96a8d2
Remove error-strict, add auto-strict
7902352
Fixes for test failures
4206d79
ENH Handle object and string types in LabelEncoder.transform
d96fbc6
Fix tests
0807604
Fix for doc test and scipy 0.11 sparse behavior
b6d198a
ENH Enforce dtypes in _apply_selected
7db5ced
TST More tests for OneHotEncoder
ac9e455
DOC Add What's new and polish docstring for OHE
2525019
Deprecate active_features_
7a53fe8
Switch from auto-strict to error-strict
05af448
Deprecate integer and list of integer inputs to `values`
d9d77ae
Address CR
840382e
Fix whitespace in doc test
ff4b30b
Fix doctest for Python 2.7
I would split this between enhancements (the new stuff) and API changes (deprecations).