[MRG+1] FIX AdaBoost ZeroDivisionError in proba #7501 #8371

dokato · 2017-02-16T10:05:31Z

Reference Issue

What does this implement/fix? Explain your changes.

Fixes #7501. As suggested, predict_proba now returns a vector of ones when there is only one class.

Any other comments?

Tests added as well.

lesteve · 2017-02-16T10:35:30Z

The Travis error looks genuine and happens on the test you have added. There are flake8 errors as well that you need to fix.

dokato · 2017-02-16T11:46:12Z

OK, done. Sorry for the double commit. I pushed unfinished amendments by mistake.

lesteve · 2017-02-16T12:15:14Z

It looks like GitHub does not recognise your email. You probably need to use the same email in your git settings and on GitHub. This makes it easier to track who changes what.

codecov · 2017-02-16T12:15:33Z

Codecov Report

Merging #8371 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #8371      +/-   ##
==========================================
+ Coverage   94.75%   94.75%   +<.01%     
==========================================
  Files         342      342              
  Lines       60801    60807       +6     
==========================================
+ Hits        57609    57615       +6     
  Misses       3192     3192

Impacted Files	Coverage Δ
sklearn/ensemble/tests/test_weight_boosting.py	`100% <100%> (ø)`	✅
sklearn/ensemble/weight_boosting.py	`96.49% <100%> (+0.02%)`	✅

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80c1bf1...ad9227a.

lesteve · 2017-02-16T15:06:08Z

sklearn/ensemble/weight_boosting.py

-        normalizer[normalizer == 0.0] = 1.0
-        proba /= normalizer
-
+        if n_classes > 1:


I think you should move the n_classes == 1 check earlier to avoid doing meaningless computation, e.g. something like:

    ...
    X = self._validate_X_predict(X)
    if n_classes == 1:
        return np.ones((X.shape[0], 1))
    ...
    # the rest of the code can stay exactly the same without an else clause
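The early-return suggestion above can be sketched as a standalone function. This is only an illustration, not the scikit-learn implementation: `predict_proba_sketch` and its `weighted_votes` argument are hypothetical names, and the sketch assumes that the failing step is a `1.0 / (n_classes - 1)` scaling, which divides by zero when there is a single class.

```python
import numpy as np


def predict_proba_sketch(weighted_votes, n_classes):
    """Hypothetical stand-in for the probability step discussed in the review."""
    votes = np.asarray(weighted_votes, dtype=float)

    # Early exit: with one class, every sample has probability 1 for it,
    # so none of the computation below is needed.
    if n_classes == 1:
        return np.ones((votes.shape[0], 1))

    # Assumed scaling step: 1.0 / (n_classes - 1) would raise
    # ZeroDivisionError for n_classes == 1 without the guard above.
    proba = np.exp((1.0 / (n_classes - 1)) * votes)

    # Normalise each row to sum to 1, protecting against all-zero rows.
    normalizer = proba.sum(axis=1)[:, np.newaxis]
    normalizer[normalizer == 0.0] = 1.0
    proba /= normalizer
    return proba


one_class = predict_proba_sketch([[0.4], [0.9], [0.1]], n_classes=1)
two_class = predict_proba_sketch([[0.2, 0.8], [0.6, 0.4]], n_classes=2)
```

Placing the guard before any arithmetic means the rest of the function needs no `else` branch, which is the point of the review comment.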

lesteve · 2017-02-16T15:07:10Z

sklearn/ensemble/tests/test_weight_boosting.py

@@ -74,6 +74,13 @@ def predict_proba(self, X):
    assert_array_equal(np.argmax(samme_proba, axis=1), [0, 1, 1, 1])


+def test_oneclass_proba():
+    # Test `predict_proba` robustness for one class label input.
+    y_t = np.ones((len(X),))


y_t = np.ones(len(X)) is the same and easier to read.
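A quick check of that equivalence, as a minimal sketch: `np.ones` accepts either an int or a tuple for its shape argument. The `X` below is placeholder data chosen for illustration, not the `X` defined in the test module.

```python
import numpy as np

# Placeholder data; only its length matters here.
X = [[1, 2], [3, 4], [5, 6]]

with_tuple = np.ones((len(X),))  # shape passed as a one-element tuple
with_int = np.ones(len(X))       # shape passed as a plain int

# Both calls build the same 1-D array of ones.
same_shape = with_tuple.shape == with_int.shape
same_values = np.array_equal(with_tuple, with_int)
```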

lesteve · 2017-02-16T15:10:59Z

sklearn/ensemble/tests/test_weight_boosting.py

@@ -74,6 +74,13 @@ def predict_proba(self, X):
    assert_array_equal(np.argmax(samme_proba, axis=1), [0, 1, 1, 1])


+def test_oneclass_proba():
+    # Test `predict_proba` robustness for one class label input.


Maybe add a reference to the original issue i.e. https://github.com/scikit-learn/scikit-learn/issues/7501.

Nitpick: I don't think the backquotes around predict_proba are really useful.

lesteve · 2017-02-16T15:11:56Z

sklearn/ensemble/tests/test_weight_boosting.py

@@ -74,6 +74,13 @@ def predict_proba(self, X):
    assert_array_equal(np.argmax(samme_proba, axis=1), [0, 1, 1, 1])


+def test_oneclass_proba():


Put adaboost somewhere in the test function name.

lesteve

Besides the minor comment LGTM.

You probably want to add an entry in doc/whats_new.rst as well.

lesteve · 2017-02-16T21:51:35Z

sklearn/ensemble/weight_boosting.py

@@ -770,7 +773,6 @@ def predict_proba(self, X):
        normalizer = proba.sum(axis=1)[:, np.newaxis]
        normalizer[normalizer == 0.0] = 1.0
        proba /= normalizer
-


Good practice: do not change code if there is not a good reason to. In this case can you please put the newline where it was?

lesteve · 2017-02-20T08:20:47Z

@dokato can you tackle #8371 (comment) and add an entry in doc/whats_new.rst?

dokato · 2017-02-20T09:12:43Z

@lesteve Yes, I did it, but I'm struggling with a strange problem. I've pushed my change, but it's not visible on GitHub. However, when I run git diff adaboost_zerodiverror origin/adaboost_zerodiverror there is no difference. Plus, I cloned my fork into a different folder again, switched to the appropriate branch, and my commit is there in the logs! I have never seen anything like this before, have you?

lesteve · 2017-02-20T09:53:23Z

Plus, I cloned my fork into a different folder again, switched to the appropriate branch, and my commit is there in the logs! I have never seen anything like this before, have you?

I have seen problems like this when GitHub was having operational issues, but that does not seem to be the case at the moment. I just pushed to my fork's master branch and I could see the update right away.

Maybe try to git commit --amend + git push -f to see whether that was a one-off glitch. Also double-check that you have not missed something, which happens to the best of us sometimes ;-).

codecov-io · 2017-02-20T09:59:16Z

Codecov Report

Merging #8371 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #8371      +/-   ##
==========================================
+ Coverage   94.75%   94.75%   +<.01%     
==========================================
  Files         342      342              
  Lines       60813    60819       +6     
==========================================
+ Hits        57621    57627       +6     
  Misses       3192     3192

Impacted Files	Coverage Δ
sklearn/ensemble/weight_boosting.py	`96.49% <100%> (+0.02%)`	✅
sklearn/ensemble/tests/test_weight_boosting.py	`100% <100%> (ø)`	✅

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 41200e1...0deb2a3.

dokato · 2017-02-20T10:08:02Z

Amending and pushing again helped, thanks for the suggestion!

lesteve · 2017-02-20T10:13:17Z

doc/whats_new.rst

@@ -153,11 +153,14 @@ Enhancements
 Bug fixes
 .........

+   - Fixed a bug where :class:`sklearn.ensemble.AdaBoostClassifier` throws
+     ``ZeroDivisionError`` while fitting data with single class labels.


Beware, this is an rst file: rst and markdown do not have the same syntax. I changed the single back-quotes into double back-quotes.

lesteve · 2017-02-20T10:15:06Z

LGTM, I'll merge when the CIs are green. Ping me if I forget.

dokato · 2017-02-20T18:30:03Z

@lesteve just to remind you to merge.

lesteve · 2017-02-21T08:26:45Z

Thanks a lot @dokato for the PR and thanks @jnothman for merging this one.

…ikit-learn#8371)

* FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501
* FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected
* FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected
* FIX scikit-learn#7501 improvements suggested by lesteve introduced
* FIX scikit-learn#7501 whats_new file updated
* Tweak in rst

dokato added 3 commits February 16, 2017 14:00

FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501

98ad534

FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected

1816bc5

FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected

9a386a2

dokato force-pushed the adaboost_zerodiverror branch from e42d3ef to 9a386a2 on February 16, 2017 13:01

lesteve requested changes Feb 16, 2017

FIX scikit-learn#7501 improvements suggested by lesteve introduced

ad9227a

lesteve approved these changes Feb 16, 2017

lesteve changed the title ~~FIX AdaBoost ZeroDivisionError in proba #7501~~ [MRG+1] FIX AdaBoost ZeroDivisionError in proba #7501 Feb 16, 2017

FIX scikit-learn#7501 whats_new file updated

fc337c5

dokato force-pushed the adaboost_zerodiverror branch from 2cb2529 to fc337c5 on February 20, 2017 09:59

Merge branch 'master' into adaboost_zerodiverror

b527be1

Tweak in rst

0deb2a3

lesteve reviewed Feb 20, 2017

jnothman merged commit fb65a0a into scikit-learn:master Feb 20, 2017

Przemo10 mentioned this pull request Mar 17, 2017

update fork (#1) #8606

Closed
