[MRG+1] FIX AdaBoost ZeroDivisionError in proba #7501 #8371
Conversation
The Travis error looks genuine and happens on the test you have added. There are flake8 errors as well that you need to fix.
OK, done. Sorry for the double commit; I pushed unfinished amendments by mistake.
It looks like GitHub does not recognise your email. You probably need to use the same email in your git settings and on GitHub. This makes it easier to track who changes what.
Codecov Report
@@ Coverage Diff @@
## master #8371 +/- ##
==========================================
+ Coverage 94.75% 94.75% +<.01%
==========================================
Files 342 342
Lines 60801 60807 +6
==========================================
+ Hits 57609 57615 +6
Misses 3192 3192
e42d3ef to 9a386a2
sklearn/ensemble/weight_boosting.py
Outdated
        normalizer[normalizer == 0.0] = 1.0
        proba /= normalizer

        if n_classes > 1:
I think you should move the n_classes == 1 check earlier to avoid doing meaningless computation, e.g. something like:
    ...
    X = self._validate_X_predict(X)
    if n_classes == 1:
        return np.ones((X.shape[0], 1))
    ...  # the rest of the code can stay exactly the same without an else clause
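A minimal, self-contained sketch of the early-return pattern suggested above, combined with the zero-sum guard visible in the diff. The function name `normalized_proba` and its `scores` argument are hypothetical stand-ins for the tail of `predict_proba`; this is not the actual scikit-learn implementation.

```python
import numpy as np


def normalized_proba(scores, n_classes):
    """Hypothetical stand-in for the end of predict_proba.

    scores: array-like of shape (n_samples, n_classes), non-negative.
    """
    scores = np.asarray(scores, dtype=float)

    # Early return: with a single class every sample gets probability 1,
    # so none of the later computation is needed.
    if n_classes == 1:
        return np.ones((scores.shape[0], 1))

    proba = scores.copy()
    normalizer = proba.sum(axis=1)[:, np.newaxis]
    # Rows that sum to zero would divide by zero; divide by 1 instead.
    normalizer[normalizer == 0.0] = 1.0
    proba /= normalizer
    return proba
```

Returning early keeps the multi-class path unchanged and avoids wrapping it in an `else` clause.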
@@ -74,6 +74,13 @@ def predict_proba(self, X):
    assert_array_equal(np.argmax(samme_proba, axis=1), [0, 1, 1, 1])


def test_oneclass_proba():
    # Test `predict_proba` robustness for one class label input.
    y_t = np.ones((len(X),))
y_t = np.ones(len(X))
is the same and easier to read.
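A quick check of that equivalence, as a sketch: `n` is a hypothetical sample count standing in for `len(X)`.

```python
import numpy as np

n = 4  # hypothetical sample count standing in for len(X)

a = np.ones((n,))  # shape given as a one-element tuple
b = np.ones(n)     # shape given as a plain integer

# Both calls produce the same 1-D array of ones.
same = a.shape == b.shape and np.array_equal(a, b)
```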
@@ -74,6 +74,13 @@ def predict_proba(self, X):
    assert_array_equal(np.argmax(samme_proba, axis=1), [0, 1, 1, 1])


def test_oneclass_proba():
    # Test `predict_proba` robustness for one class label input.
Maybe add a reference to the original issue, i.e. https://github.com/scikit-learn/scikit-learn/issues/7501.
Nitpick: I don't think the backquotes around predict_proba are really useful.
@@ -74,6 +74,13 @@ def predict_proba(self, X):
    assert_array_equal(np.argmax(samme_proba, axis=1), [0, 1, 1, 1])


def test_oneclass_proba():
put adaboost somewhere in the test function.
Besides the minor comment LGTM.
You probably want to add an entry in doc/whats_new.rst as well.
sklearn/ensemble/weight_boosting.py
Outdated
@@ -770,7 +773,6 @@ def predict_proba(self, X):
        normalizer = proba.sum(axis=1)[:, np.newaxis]
        normalizer[normalizer == 0.0] = 1.0
        proba /= normalizer
Good practice: do not change code if there is not a good reason to. In this case can you please put the newline where it was?
@dokato can you tackle #8371 (comment) and add an entry in doc/whats_new.rst?
@lesteve Yes, I did it, but I'm struggling with a strange problem. I've pushed my change, but it's not visible on GitHub. When I make
I have seen problems like this when GitHub was having operational issues, but that does not seem to be the case at the moment. I just pushed to the master branch of my fork and I could see the update right away. Maybe try to
2cb2529 to fc337c5
Codecov Report
@@ Coverage Diff @@
## master #8371 +/- ##
==========================================
+ Coverage 94.75% 94.75% +<.01%
==========================================
Files 342 342
Lines 60813 60819 +6
==========================================
+ Hits 57621 57627 +6
Misses 3192 3192
Amending and pushing again helped, thanks for the suggestion!
@@ -153,11 +153,14 @@ Enhancements
Bug fixes
.........

- Fixed a bug where :class:`sklearn.ensemble.AdaBoostClassifier` throws
  ``ZeroDivisionError`` while fitting data with single class labels.
Beware, this is an rst file; rst and markdown do not have the same syntax. I changed the single back-quotes into double back-quotes.
LGTM, I'll merge when the CIs are green. Ping me if I forget.
@lesteve just to remind you to merge.
…ikit-learn#8371) * FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 * FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected * FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected * FIX scikit-learn#7501 improvements suggested by lesteve introduced * FIX scikit-learn#7501 whats_new file updated * Tweak in rst
Reference Issue
What does this implement/fix? Explain your changes.
Fixes #7501. As suggested, predict_proba now returns a vector of ones when there is only one class.
Any other comments?
Tests added as well.