[MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform on unseen labels #9816

ghost · 2017-09-21T13:31:36Z

Reference Issue

amueller

looks good to me.

amueller · 2017-09-21T16:05:52Z

sklearn/preprocessing/label.py

        diff = np.setdiff1d(y, np.arange(len(self.classes_)))
-        if diff:
-            raise ValueError("y contains new labels: %s" % str(diff))
+        if len(diff):


This len is the fix, right?

Correct - that's all that was needed to produce a sensible error 😄
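For readers following the review, here is a minimal sketch of why the bare `if diff:` check misbehaved. It assumes only NumPy; the `classes` array and the `unseen_codes` helper are hypothetical stand-ins for a fitted encoder, not code from the PR.

```python
import numpy as np

# Hypothetical stand-in for a fitted encoder's classes_ attribute.
classes = np.array([-1, 1, 2, 3])
# Integer codes 0..len(classes)-1 are the only valid inputs to decode.
valid_codes = np.arange(len(classes))


def unseen_codes(y):
    """Return the codes in y that do not map to any known class."""
    return np.setdiff1d(y, valid_codes)


diff = unseen_codes([4, 5])  # two codes the encoder has never seen

# Old check: truth-testing a multi-element array raises its own ValueError,
# which hides the intended "unseen labels" message.
try:
    bool(diff)
    ambiguous = False
except ValueError:
    ambiguous = True

# New check: the length of the array is an unambiguous test.
has_unseen = len(diff) > 0
```

With a single unseen code the old check happened to work, which is probably why the problem only surfaced when several new values were passed at once.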

amueller · 2017-09-21T16:06:37Z

sklearn/preprocessing/tests/test_label.py

-    assert_raises(ValueError, le.inverse_transform, [-1])
+    le.fit([1, 2, 3, -1, 1])
+    msg = "contains previously unseen labels"
+    assert_raise_message(ValueError, msg, le.inverse_transform, [-2])


I find the organization of the tests a bit weird but not your fault. The test that it actually works if they are present is way at the top of the file.

Happy to reorganise tomorrow if you are able to give me some pointers - I'm not very familiar with the testing structure of sklearn as this is my first issue.

it's fine, I think.
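As an illustration of the organization discussed above, the "works when present" case and the "raises when unseen" case could sit side by side. This is only a sketch: `inverse_transform_sketch` is a hypothetical NumPy-only stand-in, not the library method, and the full message wording beyond the substring checked in the test is an assumption.

```python
import numpy as np


def inverse_transform_sketch(classes, y):
    """Hypothetical stand-in for LabelEncoder.inverse_transform."""
    classes = np.asarray(classes)
    y = np.asarray(y)
    diff = np.setdiff1d(y, np.arange(len(classes)))
    if len(diff):
        # Assumed wording; the PR's test only checks the substring
        # "contains previously unseen labels".
        raise ValueError("y contains previously unseen labels: %s" % str(diff))
    return classes[y]


# Same training labels as the PR's test; np.unique sorts and deduplicates.
classes = np.unique([1, 2, 3, -1, 1])

# Positive case: known codes decode back to their labels.
decoded = inverse_transform_sketch(classes, [0, 3])

# Negative case: an unknown code raises with the expected message.
try:
    inverse_transform_sketch(classes, [-2])
    message = None
except ValueError as exc:
    message = str(exc)
```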

lesteve · 2017-09-21T17:21:38Z

LGTM, merging, thanks a lot @newey01c!

jnothman · 2017-09-24T23:32:01Z

This is missing a whats_new entry. I'll pull it into my 0.19.1 branch and write an entry there

vdaita · 2018-02-03T22:09:15Z

The issue appears to be persistent - I am using LabelEncoder. Here is my stack trace:

  File "ann.py", line 71, in <module>
    X_train, X_test, y_train, y_test = get_dataset("Churn_Modelling.csv", 3, 13, 13)
  File "ann.py", line 28, in get_dataset
    encoder.fit(labels)
  File "/home/yolopc/.local/lib/python3.5/site-packages/sklearn/preprocessing/label.py", line 96, in fit
    self.classes_ = np.unique(y)
  File "/home/yolopc/.local/lib/python3.5/site-packages/numpy/lib/arraysetops.py", line 210, in unique
    return _unique1d(ar, return_index, return_inverse, return_counts)
  File "/home/yolopc/.local/lib/python3.5/site-packages/numpy/lib/arraysetops.py", line 277, in _unique1d
    ar.sort()
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Do you have any suggestions?

jnothman · 2018-02-03T23:41:09Z

As noted in #10552, this was accidentally not included in the 0.19.1 release. Using the development version of scikit-learn will make it work.

vdaita · 2018-02-04T16:54:56Z

I tried to follow the instructions on the website, but I could not import any scikit-learn libraries. I also ran pip install -e, but I got the following error: IndexError: list index out of range. Do you have any pointers? I am using Python 2.7 on Ubuntu 16.04.

jnothman · 2018-02-04T21:59:33Z

Not sure, without a full traceback, which index was out of range. You can try pip install https://github.com/scikit-learn/scikit-learn/archive/master.zip to install the latest development version.

vdaita · 2018-02-05T20:17:08Z

Thank you - I found an alternate solution.

Fix incorrect ValueError, add extra test case

397a8a2

ghost mentioned this pull request Sep 21, 2017

Fix ValueError in LabelEncoder when using inverse_transform on unseen labels #9813

Closed

Wrap error message to <79 chars

effbd45

amueller reviewed Sep 21, 2017

amueller changed the title ~~Fix ValueError in LabelEncoder when using inverse_transform on unseen labels~~ [MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform on unseen labels Sep 21, 2017

lesteve merged commit c554aad into scikit-learn:master Sep 21, 2017

jnothman added this to the 0.19.1 milestone Sep 24, 2017

ghost deleted the patch_9812 branch September 25, 2017 08:01

maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform…

9985b80

… on unseen labels (scikit-learn#9816)

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[MRG + 1] Fix ValueError in LabelEncoder when using inverse_transform…

35f2151

… on unseen labels (scikit-learn#9816)

ghost mentioned this pull request Dec 27, 2017

LabelEncoder: ValueError: The truth value of an array with more than one element is ambiguous #9812

Closed

jnothman mentioned this pull request Jan 10, 2018

Numpy deprecation warning: sklearn/preprocessing/label.py:151 #10449

Closed

svadali16 mentioned this pull request Jan 30, 2018

Inverse Transform for Label Encoder fails when more than one new values are present. #10552

Closed

qinhanmin2014 mentioned this pull request Jan 31, 2018

DOC move labelencoder what's new from 0.19 to 0.20 #10556

Merged

ayashjorden mentioned this pull request Mar 10, 2018

empty parsed text RasaHQ/rasa#846

Closed

ianozsvald mentioned this pull request Aug 12, 2018

Conflict with PermutationImportance, DataFrame and XGBoost (with workaround) TeamHG-Memex/eli5#256

Closed


5 participants
