
Conversation

@Harry040 (Contributor) commented Apr 25, 2016

Reference Issue

metrics.log_loss fails when any classes are missing in y_true #4033
Fix a bug, the result is wrong when use sklearn.metrics.log_loss with one class, #4546
Log_loss is calculated incorrectly when only 1 class present #6703

What does this implement/fix? Explain your changes.

Added a labels option. When the number of unique values in y_true is smaller than the number of columns of y_score/y_pred, labels should be passed so that len(unique(labels)) equals y_pred.shape[1] (see the sketch below).
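
A minimal sketch of the intended usage, assuming the labels argument this PR introduces; the data values are made up for illustration:

import numpy as np
from sklearn.metrics import log_loss

# y_true contains only class 0, but y_pred has one probability column per class.
y_true = [0, 0]
y_pred = [[0.8, 0.2], [0.6, 0.4]]

# Without labels, np.unique(y_true) has a single entry and cannot be matched
# against y_pred.shape[1]; passing labels resolves the ambiguity.
score = log_loss(y_true, y_pred, labels=[0, 1])
print(score)  # -(log(0.8) + log(0.6)) / 2, roughly 0.367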

Any other comments?

predict_proba method.
labels : array-like
When len(unique(y_true)) < len(unique(y_pred)), you must use labels option and
Member:

WDYT about rewording this as: "if labels is not provided, the labels are inferred from y_true, or more specifically, set to np.unique(y_true)"?

@MechCoder changed the title from "fixed log_loss bug" to "[MRG] fixed log_loss bug" Apr 26, 2016
assert_almost_equal(score1, score2)


def test_log_loss():
Member:

Please change the name of the test to something more specific, we are not testing log_loss but the correctness of the labels argument.

@Harry040 (Contributor, author):

@MechCoder I have reworked some points. There may still be things to change; please tell me and I will improve them.

clf.fit(X, y)


y_score = clf.predict_proba([[2,2], [2,2]])
Member:

For the test you can probably hard code y_score = [[0, 1], [0, 1]].

Also, can you run flake8 on your code?

Member:

As said before, something in between 0 and 1 might be better, to avoid the clipping done below.
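
A hedged sketch of what these review comments are asking for; the test name and values below are illustrative, not the exact code in the PR:

import numpy as np
from numpy.testing import assert_almost_equal
from sklearn.metrics import log_loss

def test_log_loss_missing_labels():
    # Hard-coded probabilities strictly between 0 and 1 avoid the clipping branch.
    y_true = [1, 1]
    y_score = [[0.2, 0.8], [0.3, 0.7]]
    expected = -(np.log(0.8) + np.log(0.7)) / 2
    assert_almost_equal(log_loss(y_true, y_score, labels=[0, 1]), expected)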

Contributor (author):

I had not used flake8 before. After running it, there were a few things to change. Thanks.

from sklearn.metrics import zero_one_loss
from sklearn.metrics import brier_score_loss

from sklearn.tree import DecisionTreeClassifier
Member:

unused import

@Harry040 closed this Jun 21, 2016
@jnothman (Member):

What is the reason for closing, @hongguangguo ?

@MechCoder reopened this Jun 21, 2016
@Harry040 (Contributor, author) commented Jun 21, 2016:

Oh, I don't know at what point a pull request should be closed. @jnothman

@jnothman (Member):

> Oh, I don't know at what point a pull request should be closed. @jnothman

As in, you think it should have been further reviewed or merged, but it's remained dormant? A better idea would be to say "could someone please review this?"

assert_almost_equal(calculated_log_loss, ture_log_loss)



Member:

extra line

@MechCoder (Member):

Just some minor cosmetic comments. +1 for merge.

@MechCoder changed the title from "[MRG] fixed log_loss bug" to "[MRG+1] fixed log_loss bug" Jun 21, 2016
@MechCoder added this to the 0.18 milestone Jul 9, 2016
@MechCoder (Member):

@hongguangguo Would you have time to address the comments? It would be nice to get this in 0.18.

@indianajensen (Contributor):

@MechCoder - looks like @hongguangguo might be on an extended "break" from coding, judging from other github activity. How do you feel about someone else making the changes you highlight? When would this have to be done by to get it in the code in time for v0.18?

@nelson-liu (Contributor):

@indianajensen the beta is scheduled for mid-August and the final release is tentatively planned for the first week of September. However, this doesn't have to be in by 0.18, though it would be very nice to have it. Given @hongguangguo's long period of inactivity even after being reminded, I think it's reasonable for you to cherry-pick the commits from this PR and bring it up to merging quality.

indianajensen added a commit to indianajensen/scikit-learn that referenced this pull request Aug 7, 2016
@indianajensen (Contributor):

Thanks @nelson-liu. I certainly don't want to take any of the credit for @hongguangguo's great work here, but I have put up a new PR as per your suggestion; we can back it out later if it becomes redundant. Please have a look and see if you think I have missed anything.

indianajensen added a commit to indianajensen/scikit-learn that referenced this pull request Aug 9, 2016
indianajensen added a commit to indianajensen/scikit-learn that referenced this pull request Aug 9, 2016
indianajensen added a commit to indianajensen/scikit-learn that referenced this pull request Aug 9, 2016
@MechCoder (Member):

Closing in favour of #7166

@MechCoder closed this Aug 24, 2016

T = lb.transform(y_true)

if T.shape[1] == 1:
Member:

So I'd say "if T.shape[1] == 1 and len(labels) == 2". Is there a check that the shape of y_pred matches len(labels)?
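
For reference, a rough sketch of the logic under discussion (not the exact patch; the variable names and the guard below are illustrative). LabelBinarizer emits a single column for binary problems, so the complementary column has to be added back before the width of y_pred can be compared with the number of labels:

import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Illustrative binary example: LabelBinarizer yields a single column here.
y_true = [0, 1, 1]
y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]])

lb = LabelBinarizer()
T = lb.fit_transform(y_true)  # shape (3, 1) in the binary case

if T.shape[1] == 1:
    # Prepend the column for the complementary class so T matches y_pred.
    T = np.append(1 - T, T, axis=1)

# A possible guard that y_pred has one column per label (illustrative only).
if y_pred.shape[1] != T.shape[1]:
    raise ValueError("y_pred has %d columns but %d labels were found"
                     % (y_pred.shape[1], T.shape[1]))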

nelson-liu pushed a commit to nelson-liu/scikit-learn that referenced this pull request Aug 24, 2016
fixes as per existing pull request scikit-learn#6714

fixed log_loss bug

enhance log_loss labels option feature

log_loss

changed test log_loss case

u

add ValueError in log_loss

fixes as per existing pull request scikit-learn#6714

fixed error message when y_pred and y_test labels don't match

fixed error message when y_pred and y_test labels don't match

corrected doc/whats_new.rst for syntax and with correct formatting of credits

additional formatting fixes for doc/whats_new.rst

fixed versionadded comment

removed superfluous line

removed superflous line
nelson-liu pushed a commit to nelson-liu/scikit-learn that referenced this pull request Aug 24, 2016
nelson-liu pushed a commit to nelson-liu/scikit-learn that referenced this pull request Aug 25, 2016
MechCoder added a commit that referenced this pull request Aug 25, 2016
…ber of classes in y_true and y_pred differ

Fixes #4033 , #4546 , #6703

* fixed log_loss bug

enhance log_loss labels option feature

log_loss

changed test log_loss case

u

add ValueError in log_loss

* fixed error message when y_pred and y_test labels don't match

fixes as per existing pull request #6714

fixed log_loss bug

enhance log_loss labels option feature

log_loss

changed test log_loss case

u

add ValueError in log_loss

fixes as per existing pull request #6714

fixed error message when y_pred and y_test labels don't match

fixed error message when y_pred and y_test labels don't match

corrected doc/whats_new.rst for syntax and with correct formatting of credits

additional formatting fixes for doc/whats_new.rst

fixed versionadded comment

removed superfluous line

removed superflous line

* Wrap up changes to fix log_loss bug and clean up log_loss

fix a typo in whatsnew

refactor conditional and move dtype check before np.clip

general cleanup of log_loss

remove dtype checks

edit non-regression test and wordings

fix non-regression test

misc doc fixes / clarifications + final touches

fix naming of y_score2 variable

specify log loss is only valid for 2 labels or more
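
As a rough illustration of the behaviour described by the last two commit messages ("add ValueError in log_loss", "specify log loss is only valid for 2 labels or more"); the exact error wording may differ:

from sklearn.metrics import log_loss

# With a single class in y_true and no labels argument, the fix raises a
# ValueError instead of silently returning a misleading score.
try:
    log_loss([1, 1], [[0.2, 0.8], [0.3, 0.7]])
except ValueError as exc:
    print(exc)  # asks for the true labels to be passed explicitly via `labels`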
TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016