[MRG+1] fixed log_loss bug #6714
Conversation
sklearn/metrics/classification.py (Outdated)

        predict_proba method.

    labels : array-like
        When len(unique(y_true)) < len(unique(y_pred)), you must use labels option and
WDYT about rewording this as: "if labels is not provided, the labels are inferred from y_true or, more specifically, set to np.unique(y_true)"?
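(For context, a minimal sketch of what that default inference looks like; the `y_true` values here are made up purely for illustration:)

```python
import numpy as np

y_true = [1, 1, 1]   # only one class actually observed
labels = None        # caller did not pass labels

if labels is None:
    # the reworded docstring: labels are inferred from y_true,
    # i.e. set to np.unique(y_true)
    labels = np.unique(y_true)

print(labels)  # [1] -- a single inferred label, which is where the bug bites
```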
    assert_almost_equal(score1, score2)


def test_log_loss():
Please change the name of the test to something more specific; we are not testing log_loss but the correctness of the labels argument.
@MechCoder I have rewritten some points. There may still be some points to change; please tell me and I will improve them.
    clf.fit(X, y)


    y_score = clf.predict_proba([[2, 2], [2, 2]])
For the test you can probably hard code y_score = [[0, 1], [0, 1]].
Also, can you run flake8 on your code?
As said before, something between 0 and 1 might be better, to avoid the clipping done below.
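A hypothetical version of the test values that would follow both suggestions (hard-coded probabilities strictly between 0 and 1, so the clipping inside log_loss never changes them; the values are illustrative only):

```python
from sklearn.metrics import log_loss

# hard-coded probabilities instead of clf.predict_proba(...),
# strictly between 0 and 1 so the clipping inside log_loss is a no-op
y_true = [1, 1]
y_score = [[0.2, 0.8], [0.1, 0.9]]

# only class 1 appears in y_true, so the labels argument is needed
loss = log_loss(y_true, y_score, labels=[0, 1])
```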
Previously I did not use flake8. After running flake8, there were some points to change. Thanks.
from sklearn.metrics import zero_one_loss
from sklearn.metrics import brier_score_loss

from sklearn.tree import DecisionTreeClassifier
unused import
What is the reason for closing, @hongguangguo?
Oh, I didn't know the right time to close the pull request. @jnothman
As in, you think it should have been further reviewed or merged, but it's remained dormant? A better idea would be to say "could someone please review this?"
    assert_almost_equal(calculated_log_loss, ture_log_loss)

extra line
Just some minor cosmetic comments. +1 for merge.
@hongguangguo Would you have time to address the comments? It would be nice to get this in 0.18.
@MechCoder - looks like @hongguangguo might be on an extended "break" from coding, judging from other GitHub activity. How do you feel about someone else making the changes you highlight? When would this have to be done by to get it into the code in time for v0.18?
@indianajensen The beta is scheduled for mid-August and the final release is tentatively planned for the first week of September. However, this doesn't have to be in by 0.18, though it would be very nice to have. Given @hongguangguo's long period of inactivity even after being reminded, I think it's reasonable for you to cherry-pick the commits from this PR and try to complete it to merging quality.
Thanks @nelson-liu. I certainly don't want to take any credit for @hongguangguo's great work here, but I have put up a new PR as per your suggestion, and we can back that out later if it becomes redundant. Please have a look and see if you think I have missed anything.
Closing in favour of #7166 |
    T = lb.transform(y_true)

    if T.shape[1] == 1:
So I'd say "if T.shape[1] == 1 and len(labels) == 2". Is there a check that the shape of y_pred matches len(labels)?
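A rough sketch of what those two checks could look like, written as a standalone helper for illustration (the name `_check_log_loss_inputs` is hypothetical and not part of scikit-learn):

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

def _check_log_loss_inputs(y_true, y_pred, labels):
    """Hypothetical helper illustrating the checks suggested above."""
    lb = LabelBinarizer()
    lb.fit(labels)
    T = lb.transform(y_true)

    # binary case: LabelBinarizer yields a single column, expand it to two
    if T.shape[1] == 1 and len(labels) == 2:
        T = np.append(1 - T, T, axis=1)

    y_pred = np.asarray(y_pred)
    # the number of columns in y_pred must be consistent with the labels
    if y_pred.shape[1] != len(labels):
        raise ValueError("y_pred has %d columns but %d labels were given"
                         % (y_pred.shape[1], len(labels)))
    return T, y_pred
```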
…ber of classes in y_true and y_pred differ

Fixes #4033, #4546, #6703

* fixed log_loss bug; enhanced the log_loss labels option; changed the test_log_loss case; added a ValueError in log_loss
* fixed the error message when y_pred and y_test labels don't match (fixes as per existing pull request #6714); corrected doc/whats_new.rst syntax and the formatting of credits; fixed the versionadded comment; removed superfluous lines
* Wrap up changes to fix the log_loss bug and clean up log_loss: fix a typo in whats_new; refactor the conditional and move the dtype check before np.clip; general cleanup of log_loss; remove dtype checks; edit the non-regression test and wording; misc doc fixes / clarifications and final touches; fix naming of the y_score2 variable; specify that log loss is only valid for 2 labels or more
Reference Issue
metrics.log_loss fails when any classes are missing in y_true #4033
Fix a bug, the result is wrong when use sklearn.metrics.log_loss with one class, #4546
Log_loss is calculated incorrectly when only 1 class present #6703
What does this implement/fix? Explain your changes.
Added a labels option. When len(unique(y_true)) is less than the number of columns of y_score/y_pred, labels should be passed so that len(unique(labels)) equals y_pred.shape[1].
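As an illustration of the intended usage (the numbers below are made up; with the labels option, log_loss can be computed even when y_true contains fewer classes than y_pred has columns):

```python
from sklearn.metrics import log_loss

y_true = [2, 2]                      # only one class observed in y_true
y_pred = [[0.2, 0.7, 0.1],
          [0.6, 0.2, 0.2]]           # probability columns for three classes

# without labels this raises an error, since np.unique(y_true) has length 1
# while y_pred has 3 columns; labels makes the column-to-class mapping explicit
loss = log_loss(y_true, y_pred, labels=[1, 2, 3])
```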
Any other comments?