[MRG] ENH Add get_feature_names for OneHotEncoder #6441

yenchenlin · 2016-02-24T05:08:56Z

This is a PR for #6425.
I've added get_feature_names to OneHotEncoder.
Can @jnothman please have a look at this?

jnothman · 2016-02-24T05:10:06Z

sklearn/preprocessing/data.py

+        feature_names = []
+        for (i, n_value) in enumerate(self.n_values_):
+            for j in xrange(n_value):
+                feature_names.append(input_features[i])


I think you want something like "{}={}".format(name, value)

Sorry, can you elaborate?

"{}={}".format(name, value)

What is name and value here?

.format(input_features[i], j) rather

Oh, you mean adding j to feature_names to make it clearer, so that the output becomes something like:

['x0 0', 'x0 1', 'x1 0', 'x1 1', 'x1 2', 'x2 0', 'x2 1', 'x2 2', 'x2 3']

Am I wrong?

I'd rather have the = in there...

Yeah, I agree that the following output is better:

['x0=0', 'x0=1', 'x1=0', 'x1=1', 'x1=2', 'x2=0', 'x2=1', 'x2=2', 'x2=3']

I've updated the code.
Please have a look.
Thanks!
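
A minimal sketch of the naming scheme agreed on in this exchange, written as a standalone function rather than as the PR's actual method. The function name `one_hot_feature_names` and the plain-list `n_values` argument are assumptions made for illustration; `n_values` stands in for the encoder's `n_values_` attribute.

```python
def one_hot_feature_names(n_values, input_features=None):
    """Build 'name=value' labels for one-hot encoded columns.

    n_values[i] is the number of encoded columns produced for input
    feature i (a hypothetical stand-in for the encoder's n_values_).
    """
    if input_features is None:
        # Default names x0, x1, ... as in the diff under review.
        input_features = ["x%d" % i for i in range(len(n_values))]
    feature_names = []
    for name, n_value in zip(input_features, n_values):
        for value in range(n_value):
            # The "=" separator is the form requested in the review.
            feature_names.append("{}={}".format(name, value))
    return feature_names


names = one_hot_feature_names([2, 3, 4])
print(names)
```

With three features taking 2, 3 and 4 values, this reproduces the list quoted above, from 'x0=0' through 'x2=3'.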

jnothman · 2016-02-24T10:04:15Z

sklearn/preprocessing/data.py

+        else:
+            if len(input_features) != len(self.n_values_):
+                raise ValueError("Number of input_features must equal to "
+                                 "n_feature. it has to be of shape "


What is n_feature?

Oh, it should be n_features, as in this line:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/preprocessing/data.py#L1710

I've updated the code.
Thanks!

jnothman · 2016-02-24T12:11:05Z

sklearn/preprocessing/data.py

+            input_features = ['x%d' % i for i in range(len(self.n_values_))]
+        else:
+            if len(input_features) != len(self.n_values_):
+                raise ValueError("Number of input_features must equal to "


This is still clunky. How about "Length of input_features is {0} but it must equal number of features when fitted: {1}."?

Yeah, and showing len(self.n_values_) in the error message may be more informative too.

Code updated.
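
The length check with the suggested wording could look roughly like the sketch below. The helper name `check_input_features` and its `n_features` argument are hypothetical; in the PR the expected length is `len(self.n_values_)`.

```python
def check_input_features(input_features, n_features):
    """Validate that one name is supplied per fitted input feature."""
    if len(input_features) != n_features:
        # Report both lengths so the caller can see the mismatch at a glance.
        raise ValueError(
            "Length of input_features is {0} but it must equal "
            "number of features when fitted: {1}.".format(
                len(input_features), n_features))
    return list(input_features)


try:
    check_input_features(["age", "city"], 3)
except ValueError as exc:
    message = str(exc)
    print(message)
```

Including both numbers in the message is the point of the suggestion: the caller does not need to inspect the fitted encoder to find out what length was expected.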

yenchenlin · 2016-02-28T06:23:05Z

Hello @jnothman,
Is there any problem with this?

jnothman · 2016-02-28T09:56:31Z

LGTM

jnothman · 2016-02-28T09:57:00Z

but it may be subject to an embargo :p

yenchenlin · 2016-02-29T15:36:53Z

Oh yeah ...
However, I think this function is really useful for OneHotEncoder, since it makes the encoder's output much clearer.

MechCoder · 2016-02-29T22:14:59Z

LGTM as well.

MechCoder · 2016-02-29T23:11:40Z

Actually, I take back my +1. n_values_ is the maximum categorical value of each feature plus one, not the number of categories.

from sklearn.preprocessing import OneHotEncoder

data = [[1, 100], [10, 200]]
enc = OneHotEncoder(handle_unknown="error")
enc.fit(data)
enc.n_values_
[11, 201]

This should probably wait for #5270 and the unique_samples_ attribute in that PR
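
To make the objection concrete, here is a small pure-Python sketch, assuming `n_values_` is computed as the per-column maximum plus one, as the output above suggests. The variable names are illustrative and are not taken from #5270.

```python
data = [[1, 100], [10, 200]]

# Transpose rows into columns: (1, 10) and (100, 200).
columns = list(zip(*data))

# What n_values_ reports under the stated assumption: max value + 1 per column.
n_values = [max(col) + 1 for col in columns]

# The categories that were actually observed in each column.
categories = [sorted(set(col)) for col in columns]

# Number of feature names each approach would generate.
n_names_from_n_values = sum(n_values)
n_names_from_categories = sum(len(c) for c in categories)

print(n_values, n_names_from_n_values)
print(categories, n_names_from_categories)
```

Names built from `n_values` would cover 212 value slots, although only four distinct categories were seen during fitting, which is why names derived from the observed unique values are the safer basis.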

jnothman · 2016-03-01T06:18:11Z

oh, i forgot about that... Withholding my +1.

amueller · 2016-10-08T03:12:13Z

I think this should wait for the refactoring of OneHotEncoder for accepting strings in #7327

jorisvandenbossche · 2018-09-24T14:57:07Z

This has been added in #10198 in the meantime. So closing this, but @yenchenlin thanks for working on it anyway!

jnothman reviewed Feb 24, 2016

yenchenlin changed the title ~~[MRG] ENH Add get_feature_names for OneHotEncoder~~ [WIP] ENH Add get_feature_names for OneHotEncoder Feb 24, 2016

yenchenlin force-pushed the add-get_feature_names-for-onehotencoder branch 2 times, most recently from f4b4d5b to acb09d2 Compare February 24, 2016 06:48

jnothman reviewed Feb 24, 2016

yenchenlin force-pushed the add-get_feature_names-for-onehotencoder branch from acb09d2 to 7aa1754 Compare February 24, 2016 10:48

jnothman reviewed Feb 24, 2016

yenchenlin force-pushed the add-get_feature_names-for-onehotencoder branch 2 times, most recently from 61e1331 to ac47fab Compare February 24, 2016 13:31

jnothman mentioned this pull request Feb 24, 2016

Transformative get_feature_names for various transformers #6425

Closed

11 tasks

yenchenlin changed the title ~~[WIP] ENH Add get_feature_names for OneHotEncoder~~ [MRG] ENH Add get_feature_names for OneHotEncoder Feb 25, 2016

../preprocessing/tests/test_data.py

15a4a75

yenchenlin force-pushed the add-get_feature_names-for-onehotencoder branch from ac47fab to 15a4a75 Compare February 27, 2016 14:40

jnothman changed the title ~~[MRG] ENH Add get_feature_names for OneHotEncoder~~ [MRG+1] ENH Add get_feature_names for OneHotEncoder Feb 28, 2016

MechCoder changed the title ~~[MRG+1] ENH Add get_feature_names for OneHotEncoder~~ [MRG+2] ENH Add get_feature_names for OneHotEncoder Feb 29, 2016

jnothman changed the title ~~[MRG+2] ENH Add get_feature_names for OneHotEncoder~~ [MRG] ENH Add get_feature_names for OneHotEncoder Mar 1, 2016

eyadsibai mentioned this pull request Jun 7, 2018

Add get_feature_names() method scikit-learn-contrib/category_encoders#79

Closed

jorisvandenbossche closed this Sep 24, 2018
