[MRG+1] Update class_weight.py #8319

Merged
merged 4 commits into from
Feb 21, 2017

Conversation

MMeketon
Contributor

@MMeketon MMeketon commented Feb 8, 2017

closes #8312
Changed ValueError message to reflect that a class label may be non-integer

Reference Issue #8312

What does this implement/fix? Explain your changes.

The original line ValueError("Class label %d not present." % c) implicitly assumed that the class label c was an integer. The fix, ValueError("Class label {} not present.".format(c)), allows the class label to be of a more general type, such as a string.
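
A minimal sketch of the difference described above, using only built-in string formatting. The variable names are illustrative and are not taken from the scikit-learn source. With a string label, the `%d` form fails while the message is being built, so the intended ValueError is never constructed.

```python
label = "D"  # a string class label, as in the linked issue

# Old style: %d requires a number, so the formatting step itself raises
# TypeError before any ValueError could be created from the message.
try:
    old_message = "Class label %d not present." % label
    old_error = None
except TypeError as exc:
    old_message = None
    old_error = exc

# New style: str.format with {} accepts any object, including strings.
new_message = "Class label {} not present.".format(label)

# Integer labels still format as before.
int_message = "Class label {} not present.".format(3)

print(type(old_error).__name__)
print(new_message)
print(int_message)
```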

Any other comments?

This is my first ever attempt to fix anything in "open source" using github. Let me know if I goofed in the process.

Changed ValueError message to reflect that a class label may be non-integer
@codecov

codecov bot commented Feb 8, 2017

Codecov Report

Merging #8319 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #8319      +/-   ##
==========================================
+ Coverage   94.73%   94.75%   +0.02%     
==========================================
  Files         342      342              
  Lines       60674    60893     +219     
==========================================
+ Hits        57482    57702     +220     
+ Misses       3192     3191       -1
Impacted Files Coverage Δ
sklearn/utils/tests/test_class_weight.py 100% <100%> (ø)
sklearn/utils/class_weight.py 100% <100%> (ø)
sklearn/pipeline.py 99.26% <ø> (-0.36%)
sklearn/linear_model/ridge.py 93.88% <ø> (-0.02%)
sklearn/datasets/__init__.py 100% <ø> (ø)
sklearn/metrics/classification.py 97.77% <ø> (ø)
sklearn/linear_model/tests/test_ridge.py 100% <ø> (ø)
sklearn/neighbors/tests/test_kd_tree.py 97.45% <ø> (ø)
sklearn/linear_model/tests/test_randomized_l1.py 100% <ø> (ø)
sklearn/ensemble/tests/test_weight_boosting.py 100% <ø> (ø)
... and 15 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 542c02b...268fd9f.

@TomDLT TomDLT changed the title Update class_weight.py [MRG+1] Update class_weight.py Feb 8, 2017
Member

@TomDLT TomDLT left a comment

LGTM

@jnothman
Member

jnothman commented Feb 8, 2017 via email

@MMeketon
Contributor Author

MMeketon commented Feb 8, 2017 via email

@jnothman
Member

jnothman commented Feb 8, 2017 via email

@MMeketon
Contributor Author

MMeketon commented Feb 9, 2017 via email

@lesteve
Member

lesteve commented Feb 15, 2017

@MMeketon have you made progress on the test? It seems like a useful fix.

@MMeketon
Contributor Author

MMeketon commented Feb 15, 2017 via email

@MMeketon
Contributor Author

MMeketon commented Feb 16, 2017 via email

@MMeketon
Contributor Author

MMeketon commented Feb 16, 2017 via email

@lesteve
Member

lesteve commented Feb 16, 2017

I just added the test.

It is not part of the diff. You probably haven't pushed your commit to your branch then.

@MMeketon
Contributor Author

MMeketon commented Feb 17, 2017 via email

@jnothman
Member

No, nothing's changed.

@lesteve
Member

lesteve commented Feb 20, 2017

You can look at the diff yourself clicking on the "Files changed" tab, which you can access through this URL:
https://github.com/scikit-learn/scikit-learn/pull/8319/files

…und in the class_weights dictionary

Previously when class_weights were used, a check was made to see if
there were elements in the class_weights dictionary that were not in the
classes array.  If so, a ValueError was raised.  But there was a bug in
the formatting of the ValueError that assumed the classes were integers,
so the ValueError was not raised when the classes were not numbers.
This test is to verify that the ValueError is properly raised.
classes = np.unique(y)
class_weights = {c: 1.0 for c in classes}
class_weights['D'] = 1.0 # This should get a proper ValueError
cw = assert_raises(ValueError, compute_class_weight, class_weights,
Member

@lesteve lesteve Feb 20, 2017

Well done for going through your git/github initiation! There are a few ways you can simplify this test. One is to
reuse test_compute_class_weight_not_present and add your test after this line rather than creating a new function.

I think that just adding this should be enough:

# Fix exception in error message formatting when missing label is a string
# https://github.com/scikit-learn/scikit-learn/issues/8312
assert_raise_message(ValueError,
                     'Class label label_not_present not present',
                     compute_class_weight,
                     {'label_not_present': 1.}, classes, y)

Contributor Author

I used test_compute_class_weight_not_present as a template for the test I introduced, but did not want to change it. I'm personally OK with multiple asserts in a single test case, but others (e.g., Robert Martin, "Clean Code", Ch 9) are against it, so I didn't want to upset others with multiple asserts.

I did incorporate your suggestion of using assert_raise_message which is an excellent idea - thank you.
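
The point of checking the message as well as the exception type can be sketched without scikit-learn. Both functions below are hypothetical stand-ins written for this illustration: `check_class_weight_keys` is not the library's `compute_class_weight`, and `raises_with_message` only imitates what a helper such as `assert_raise_message` is used for in this thread.

```python
def check_class_weight_keys(class_weight, classes):
    """Hypothetical stand-in: reject weights given for labels not in classes."""
    known = set(classes)
    for label in class_weight:
        if label not in known:
            raise ValueError("Class label {} not present.".format(label))


def raises_with_message(exc_type, message, func, *args):
    """Return True only if func(*args) raises exc_type containing message.

    Any other exception type (for example a TypeError from a formatting
    bug) is not caught here and propagates to the caller.
    """
    try:
        func(*args)
    except exc_type as exc:
        return message in str(exc)
    return False


classes = ["A", "B", "C"]

# A string label that is missing from classes must yield the full message.
missing_label_ok = raises_with_message(
    ValueError,
    "Class label label_not_present not present",
    check_class_weight_keys,
    {"label_not_present": 1.0},
    classes,
)

# A label that is present raises nothing, so the check reports False.
present_label_raises = raises_with_message(
    ValueError,
    "not present",
    check_class_weight_keys,
    {"A": 1.0},
    classes,
)
```

Matching on the message text ties the test to the formatted label itself, which is the part of the code this pull request changed.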

…layed

A big thank you to Loïc Estève for pointing out that
assert_raise_message is a better test than assert_raises.
Contributor Author

@MMeketon MMeketon left a comment

Made changes to make Travis happier, and changed assert_raises to assert_raise_message.

@lesteve
Member

lesteve commented Feb 21, 2017

I pushed the change I mentioned in #8319 (comment). I'll wait for the CIs to be green and merge this one. @MMeketon please ping me if I forget to do it.

@lesteve lesteve merged commit a0db45d into scikit-learn:master Feb 21, 2017
@lesteve
Member

lesteve commented Feb 21, 2017

Merged, thanks a lot @MMeketon!

sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
@Przemo10 Przemo10 mentioned this pull request Mar 17, 2017
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
lemonlaug pushed a commit to lemonlaug/scikit-learn that referenced this pull request Jan 6, 2021
when the missing class label is a string.
Development

Successfully merging this pull request may close these issues.

error in error message in compute_class_weight
4 participants