
Conversation

@indianajensen
Contributor

@indianajensen indianajensen commented Aug 9, 2016

all credit for this goes to @hongguangguo (#6714)

Reference Issue
metrics.log_loss fails when any classes are missing in y_true #4033
Fix a bug, the result is wrong when use sklearn.metrics.log_loss with one class, #4546
Log_loss is calculated incorrectly when only 1 class present #6703

What does this implement/fix? Explain your changes.
Added a `labels` option. When the number of unique values in `y_true` is smaller than the number of columns of `y_score`/`y_pred`, `labels` should be passed so that `len(unique(labels))` equals `y_pred.shape[1]`.
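The behaviour described above can be sketched without scikit-learn as a minimal NumPy re-implementation of the fix. The function name `log_loss_sketch` and the exact error messages are illustrative only, not the actual scikit-learn code:

```python
import numpy as np

def log_loss_sketch(y_true, y_pred, labels=None):
    # Minimal sketch of the fix: one-hot encode y_true against an explicit
    # label set so a y_true containing a single class still lines up with
    # the columns of y_pred.
    y_pred = np.asarray(y_pred, dtype=float)
    classes = np.unique(y_true) if labels is None else np.asarray(labels)
    if labels is None and len(classes) == 1:
        raise ValueError('y_true has only one label. Please provide '
                         'the true labels explicitly through the '
                         'labels argument.')
    if len(classes) != y_pred.shape[1]:
        raise ValueError('labels must match the number of columns '
                         'in y_pred.')
    # one-hot matrix over the full label set
    T = (np.asarray(y_true)[:, None] == classes[None, :]).astype(float)
    eps = 1e-15
    y_pred = np.clip(y_pred, eps, 1 - eps)
    y_pred /= y_pred.sum(axis=1, keepdims=True)
    return float(-np.mean(np.sum(T * np.log(y_pred), axis=1)))
```

For example, `log_loss_sketch([0, 0, 0], [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]], labels=[0, 1])` evaluates the loss even though only class `0` appears in `y_true`, while omitting `labels` raises the error described in the PR.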

Harry040 and others added 2 commits August 9, 2016 11:37
enhance log_loss labels option feature

log_loss

changed test log_loss case

u

add ValueError in log_loss
@indianajensen indianajensen mentioned this pull request Aug 9, 2016
@nelson-liu
Contributor

looks like there are conflicts, would you mind rebasing on master?

- The :func:`ignore_warnings` now accepts a category argument to ignore only
the warnings of a specified type. By `Thierry Guillemot`_.

- Added Labels flag to `metrics.log_loss` to correct metrics when only one class is present
Contributor

lowercase labels, and surround it with double backticks (see line 129 for how to format parameters)

Contributor

also change metrics.log_loss to :class:metrics.log_loss

@nelson-liu
Contributor

hmm i'm not sure why past commits are showing up multiple times here..
to rebase on master, first you need to update your local master to mirror the remote, and then run git rebase master on your log_loss_bug_fixed3 branch.

sample_weight : array-like of shape = [n_samples], optional
Sample weights.
.. versionadded:: 0.18
Contributor

versionadded goes below the new thing, indented on the same level as If not provided.

Contributor Author

saw that one just now. Pushed fix

@indianajensen
Contributor Author

indianajensen commented Aug 9, 2016

Thanks @nelson-liu - I think the latest push fixes the issues you pointed out. However, on testing the code, I am not sure it fully resolves all three issues - in particular #4033.
Can you think of anyone from the team who is an expert on the log_loss code and who could help me take a look to make sure we have squashed the bugs before we close off the tickets?

@indianajensen
Contributor Author

@nelson-liu - I'll be travelling for the next few days, but let me know if you have any thoughts on the above. From my side, OK to merge as-is (I don't think the code will degrade and there are some improvements), but I am not sure it fully squashes all three bugs so might be good to have a second pair of eyes on this. Let me know your thoughts and I can do some more work on this when I am back.

lb.fit(labels) if labels is not None else lb.fit(y_true)
if labels is None and len(lb.classes_) == 1:
raise ValueError('y_true has only one label. Please provide '
'the true labels explicitly through the labels argument.')
Contributor

this doesn't look like pep8 to me

Member

We should also add another ValueError if len(lb.classes_) == 1 and labels is not None saying that labels should have more than one unique label.
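Taken together, the two review comments above (the PEP8 layout and the extra `ValueError` for a single-label `labels` argument) suggest something along these lines. This is a hypothetical sketch with a stand-in binarizer so it runs without scikit-learn, not the actual patch:

```python
class _FakeBinarizer:
    # Stand-in for sklearn.preprocessing.LabelBinarizer, only so the
    # sketch is self-contained; it just records the sorted unique labels.
    def fit(self, y):
        self.classes_ = sorted(set(y))
        return self


def validate_labels(y_true, labels=None):
    # Fit on the explicit label set when given, otherwise on y_true.
    lb = _FakeBinarizer()
    if labels is not None:
        lb.fit(labels)
    else:
        lb.fit(y_true)
    # Both degenerate cases raised by the reviewers:
    if len(lb.classes_) == 1:
        if labels is None:
            raise ValueError(
                'y_true contains only one label. Please provide the true '
                'labels explicitly through the labels argument.')
        raise ValueError(
            'The labels array needs to contain at least two labels.')
    return lb
```

For example, `validate_labels([0, 0], labels=[0, 1])` passes, while `validate_labels([0, 0])` and `validate_labels([0, 0], labels=[0])` each raise one of the two errors.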

@nelson-liu
Contributor

besides git issues / minor cosmetic changes, this LGTM. @MechCoder reviewed the original PR, would you mind taking another look?

raise ValueError('y_true has only one label. Please provide '
'the true labels explicitly through the labels argument.')

T = lb.transform(y_true)
Member

not your fault but single letter variable names are not great :-/ feel free to replace.

'the true labels explicitly through the labels argument.')

T = lb.transform(y_true)

Member

I feel like the logic below should be changed. It assumes that there are exactly two classes. We can check now whether that's true.

Member

Which logic are you referring to?

Member

Line 1611 and 1612

Member

But when is lb.transform(X).shape[1] == 1 and len(lb.classes_) > 2?

Member

@amueller amueller Aug 24, 2016

when the input is malformed?

Member

Is there a check anywhere that len(labels) == y_pred.shape[1]?

Member

Is there a check anywhere that len(labels) == y_pred.shape[1]?

Yup we should be checking that as well.

when the input is malformed?

Sorry, I am slow. Could you give an example? :(

Member

Actually, can T.shape[1] happen now at all?

Member

Never mind, I'm slow.
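For context, the lines being discussed handle the binary case: `LabelBinarizer.transform` returns a single column when there are exactly two classes, and `log_loss` appends the complementary column before computing the loss. A minimal sketch of that expansion (the helper name `expand_binary` is hypothetical):

```python
import numpy as np

def expand_binary(T):
    # With exactly two classes, the binarized indicator matrix has a
    # single column; append its complement so every sample has a full
    # probability row: columns are [negative class, positive class].
    T = np.asarray(T, dtype=float)
    if T.shape[1] == 1:
        T = np.hstack([1 - T, T])
    return T
```

This is also where the reviewers' question arises: a one-column `T` together with more than two classes can only come from malformed input, which is why an explicit check was suggested.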

@amueller
Member

This seems like the right thing to do. But does it actually fix all the issues? In cross-validation, we don't pass the labels by default, do we?

# because y_true label are the same, there should be an error if the
# labels option has not been used

# error_logloss = log_loss(y_true, y_score)
Member

either remove or uncomment?

Contributor

remove, i think...

@amueller amueller added this to the 0.18 milestone Aug 22, 2016
@MechCoder
Member

@nelson-liu Would you have the time to cherry-pick the commits (or in whatever way you like) onto a new PR? I can do it myself but then we would need one more reviewer to get it merged.

Thanks!

@nelson-liu
Contributor

Sure, today might not be great but I'll see if I can squeeze some time while I wait for jobs to run. Else I can do it tomorrow

@nelson-liu
Contributor

@MechCoder luckily (?), the cluster i'm working on is down for maintenance. i've pulled the relevant commits from this PR and addressed the issues; waiting for CI tests (at least travis) to pass on my branch before making a PR.

@MechCoder
Member

@nelson-liu Thanks! I appreciate the efforts.

@MechCoder
Member

Superseded again by #7239

@MechCoder MechCoder closed this Aug 24, 2016