
[MRG + 2] FIX bug in svm's decision values when decision_function_shape is 'ovr' #7724


Merged
merged 22 commits into from
Nov 7, 2016

Conversation

btdai
Contributor

@btdai btdai commented Oct 22, 2016

Reference Issue

Fixes #6416

What does this implement/fix? Explain your changes.

Changed the sign of dec for the function ovr_decision_function
from
return _ovr_decision_function(dec < 0, dec, len(self.classes_))
to
return _ovr_decision_function(dec < 0, -dec, len(self.classes_))

Reason: in _ovr_decision_function, the first argument (the predictions) must be consistent with the second (the confidences). If we pass dec < 0, i.e. True for negative decision values, then the confidences we feed into _ovr_decision_function need their sign inverted as well.

Hence I propose the above change. Please refer to the examples in issue #6416, thank you very much.
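To make the inconsistency concrete, here is a standalone sketch (my own minimal reimplementation of the aggregation loop in sklearn.utils.multiclass._ovr_decision_function, not the library code itself) with a single pairwise decision value:

```python
import numpy as np

def aggregate(predictions, confidences, n_classes):
    # Minimal reimplementation of the aggregation loop in
    # sklearn.utils.multiclass._ovr_decision_function: for the pairwise
    # classifier over classes (i, j), the confidence is subtracted from
    # class i and added to class j.
    n_samples = predictions.shape[0]
    votes = np.zeros((n_samples, n_classes))
    sums = np.zeros((n_samples, n_classes))
    k = 0
    for i in range(n_classes):
        for j in range(i + 1, n_classes):
            sums[:, i] -= confidences[:, k]
            sums[:, j] += confidences[:, k]
            votes[predictions[:, k] == 0, i] += 1
            votes[predictions[:, k] == 1, j] += 1
            k += 1
    return votes, sums

# libsvm's pairwise value is positive when the first class (i) wins.
dec = np.array([[2.0]])  # one sample, one pair (0, 1): class 0 wins

# Passing dec unchanged: class 0 gets the vote but a negative confidence sum.
votes, sums = aggregate(dec < 0, dec, 2)
print(votes, sums)  # [[1. 0.]] [[-2.  2.]]

# Passing -dec: the vote and the confidence sum agree.
votes, sums = aggregate(dec < 0, -dec, 2)
print(votes, sums)  # [[1. 0.]] [[ 2. -2.]]
```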

Any other comments?

the sign of dec values should be changed to feed into function _ovr_decision_function

@btdai
Contributor Author

btdai commented Oct 22, 2016

I hope this PR is correct.

@lesteve
Member

lesteve commented Oct 22, 2016

It would be great to add a test that is failing on master and passing in this branch. I think the place where it belongs is sklearn/svm/tests/test_svm.py.

@amueller
Member

+1 on the test!

@amueller amueller added this to the 0.18.1 milestone Oct 25, 2016
@amueller amueller changed the title changed line 554 of svm/base.py Fix in svm ovr_decision_function sign Oct 25, 2016
@raghavrv
Member

Why not change voting to dec > 0 instead?

@raghavrv
Member

raghavrv commented Oct 26, 2016

Actually, changing it to dec > 0 is the correct thing to do here, I think... Additionally, you need to reverse the order of the classes in the decision function... Currently it is incorrect, I think (@amueller please correct me if I am wrong).

For instance, after changing dec to -dec in master

import numpy as np
from sklearn.svm import SVC
from sklearn.utils.testing import assert_true

base_points = np.array([[1, 1], [2, 2]])

X = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [1, -1],   # Q3
    base_points * [-1, -1]   # Q4
    ))

y = [0] * 2 + [1] * 2 + [2] * 2 + [3] * 2

svm = SVC(kernel='linear', decision_function_shape='ovr')
svm.fit(X, y)

# One point close to the decision boundary and another far away
base_points = np.array([[5, 5], [10, 10]])

# For all the quadrants (classes)
X = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [1, -1],   # Q3
    base_points * [-1, -1]   # Q4
    ))

list(zip(svm.predict(X), svm.decision_function(X)))

Gives

[(0, array([ 0.25,  2.  ,  1.  ,  2.75])),
 (0, array([ 0.5,  2. ,  1. ,  2.5])),
 (1, array([ 2.  ,  0.25,  2.75,  1.  ])),
 (1, array([ 2. ,  0.5,  2.5,  1. ])),
 (2, array([ 2.  ,  2.75,  0.25,  1.  ])),
 (2, array([ 2. ,  2.5,  0.5,  1. ])),
 (3, array([ 2.75,  2.  ,  1.  ,  0.25])),
 (3, array([ 2.5,  2. ,  1. ,  0.5]))]

You can see that for the first prediction, when predicted class is 0, the decision value is higher for class = 3 and not 0.

If you fix it as

return _ovr_decision_function(dec > 0, dec, len(self.classes_))[:, ::-1]

You get

[(0, array([ 3.25,  1.  ,  2.  , -0.25])),
 (0, array([ 3.5,  1. ,  2. , -0.5])),
 (1, array([ 1.  ,  3.25, -0.25,  2.  ])),
 (1, array([ 1. ,  3.5, -0.5,  2. ])),
 (2, array([ 1.  , -0.25,  3.25,  2.  ])),
 (2, array([ 1. , -0.5,  3.5,  2. ])),
 (3, array([-0.25,  1.  ,  2.  ,  3.25])),
 (3, array([-0.5,  1. ,  2. ,  3.5]))]

Which is correct...

And the test should be the above snippet with the last line replaced by

decisions = svm.decision_function(X)

pos_class_decisions = decisions[range(8), svm.predict(X)].reshape([4, 2])

# Test if the point closer to the decision boundary has a lower value
# compared to a point farther away from the boundary
assert_true(np.all(pos_class_decisions[:, 0] < pos_class_decisions[:, 1]))

# Assert that the predicted class has the maximum value
assert_array_equal(np.argmax(decisions, axis=1), svm.predict(X))

I stupidly tested it on the fixed code ;) Both tests currently fail in master.

@btdai
Contributor Author

btdai commented Oct 26, 2016

@raghavrv thank you very much for your reply. So you agree with me that this is indeed a bug, and we should change
from
return _ovr_decision_function(dec < 0, dec, len(self.classes_))
to either

  1. return _ovr_decision_function(dec < 0, -dec, len(self.classes_))
  2. return _ovr_decision_function(dec > 0, dec, len(self.classes_))

I think it should be option 1 and you think it should be option 2.

The reason I proposed option 1 is this (from line 413 onwards in master/sklearn/utils/multiclass.py):

k = 0
for i in range(n_classes):
    for j in range(i + 1, n_classes):
        sum_of_confidences[:, i] -= confidences[:, k]
        sum_of_confidences[:, j] += confidences[:, k]
        votes[predictions[:, k] == 0, i] += 1
        votes[predictions[:, k] == 1, j] += 1
        k += 1

This piece of code aggregates the sum of confidences from each of the n_classes * (n_classes - 1) / 2 classifiers. The k-th binary classifier is between class i and class j; it subtracts the confidence from class i and adds it to class j, so the confidence is actually the confidence in class j, right?

However, in the OvO case of SVM, the dec values are positive for the first class (class i) and negative for the second class (class j). So I think we should change the sign of dec, so that class i receives negative confidence and class j receives positive confidence, to match the convention in _ovr_decision_function defined in master/sklearn/utils/multiclass.py.
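As a concrete trace (a standalone numpy sketch of the same loop with the proposed -dec fix applied, not sklearn's actual code), take one sample over 3 classes, where class 0 wins both of its pairwise problems and class 1 wins the remaining one:

```python
import numpy as np

n_classes = 3
# Pairwise order (i, j): (0, 1), (0, 2), (1, 2).
# Positive libsvm values: class 0 beats 1 and 2, class 1 beats 2.
dec = np.array([[1.5, 1.0, 0.5]])

predictions = dec < 0   # False everywhere: the first class of each pair wins
confidences = -dec      # the proposed fix: invert the sign before aggregating

votes = np.zeros((1, n_classes))
sums = np.zeros((1, n_classes))
k = 0
for i in range(n_classes):
    for j in range(i + 1, n_classes):
        sums[:, i] -= confidences[:, k]
        sums[:, j] += confidences[:, k]
        votes[predictions[:, k] == 0, i] += 1
        votes[predictions[:, k] == 1, j] += 1
        k += 1

print(votes)  # [[2. 1. 0.]] -- class 0 collects the most votes
print(sums)   # [[ 2.5 -1.  -1.5]] -- and also the largest confidence sum
```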

Now, let's run an example, with plots of original code (first row), option 1 (second row), and option 2 (third row).

import numpy as np
from sklearn import datasets, svm, multiclass
import matplotlib.pyplot as plt

iris = datasets.load_iris()
x = iris.data[:,:2]
y = iris.target

ovo = svm.SVC(kernel='linear')
ovr = svm.SVC(kernel='linear', decision_function_shape='ovr')
ovo.fit(x, y)
ovr.fit(x, y)

test_pts = [()] * 6
test_pts[0] = (5.0, 3.1) # red class nearer to decision boundaries
test_pts[1] = (4.5, 3.5) # red class further from decision boundaries
test_pts[2] = (5.5, 2.6) # green class nearer to decision boundaries
test_pts[3] = (5.0, 2.0) # green class further from decision boundaries
test_pts[4] = (6.5, 3.0) # blue class nearer to decision boundaries
test_pts[5] = (7.5, 3.0) # blue class further from decision boundaries
test_pts = np.array(test_pts)

ovo_deci = ovo.decision_function(test_pts)
ovr_deci = ovr.decision_function(test_pts)
# the line above actually calls multiclass._ovr_decision_function(ovo_deci < 0, ovo_deci, 3)
# I personally believe we should change the sign of ovo_deci as the next line
deci_opt_1 = multiclass._ovr_decision_function(ovo_deci < 0, -ovo_deci, 3)
deci_opt_2 = multiclass._ovr_decision_function(ovo_deci > 0, ovo_deci, 3)

color = ['red', 'green', 'blue']
y_color = [color[i] for i in y]

text_pos = [(5.0,4.5),(5.0,4.8),(7.5,2.1),(7.5,1.8),(7.5,4.8),(7.5,4.5)]

plt.figure(figsize = (18, 18))
for i in range(3):
    plt.subplot(3, 3, i + 1)
    plt.scatter(x[:,0], x[:,1], c=y_color, marker='+')
    plt.scatter(test_pts[2*i:2*i+2,0], test_pts[2*i:2*i+2,1], c=color[i], s = 80)
    for j in range(2*i, 2*i+2):
        plt.annotate('%.3f' % ovr_deci[j,i], xy=test_pts[j], xytext=text_pos[j],arrowprops=dict(facecolor='black', shrink=0.05))
    plt.subplot(3, 3, i + 4)
    plt.scatter(x[:,0], x[:,1], c=y_color, marker='+')
    plt.scatter(test_pts[2*i:2*i+2,0], test_pts[2*i:2*i+2,1], c=color[i], s = 80)
    for j in range(2*i, 2*i+2):
        plt.annotate('%.3f' % deci_opt_1[j,i], xy=test_pts[j], xytext=text_pos[j],arrowprops=dict(facecolor='black', shrink=0.05))
    plt.subplot(3, 3, i + 7)
    plt.scatter(x[:,0], x[:,1], c=y_color, marker='+')
    plt.scatter(test_pts[2*i:2*i+2,0], test_pts[2*i:2*i+2,1], c=color[i], s = 80)
    for j in range(2*i, 2*i+2):
        plt.annotate('%.3f' % deci_opt_2[j,i], xy=test_pts[j], xytext=text_pos[j],arrowprops=dict(facecolor='black', shrink=0.05))

plt.show()

(attached plot opt_1_or_2: rows show the original code, option 1, and option 2)

@btdai
Contributor Author

btdai commented Oct 26, 2016

@raghavrv Raghav, I tried your code as well, but I got a totally different result. I used your base points, X and y:

import numpy as np
from sklearn.svm import SVC
from sklearn.utils.testing import assert_true

base_points = np.array([[1, 1], [2, 2]])

X = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [1, -1],   # Q3
    base_points * [-1, -1]   # Q4
    ))

y = [0] * 2 + [1] * 2 + [2] * 2 + [3] * 2

svm = SVC(kernel='linear', decision_function_shape='ovr')
svm.fit(X, y)

# One point close to the decision boundary and another far away
base_points = np.array([[5, 5], [10, 10]])

# For all the quadrants (classes)
X = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [1, -1],   # Q3
    base_points * [-1, -1]   # Q4
    ))

list(zip(svm.predict(X), svm.decision_function(X)))

ovo = SVC(kernel='linear', decision_function_shape='ovo')
ovo.fit(X, y)
ovo_deci = ovo.decision_function(X)
from sklearn import multiclass
deci_opt_1 = multiclass._ovr_decision_function(ovo_deci < 0, -ovo_deci, 4)
deci_opt_2 = multiclass._ovr_decision_function(ovo_deci > 0, ovo_deci, 4)

print(deci_opt_1)
print([np.argmax(i) for i in deci_opt_1])
print(deci_opt_2)
print([np.argmax(i) for i in deci_opt_2])

The option 1 and 2 gives:

>>> print(deci_opt_1)
[[ 3.25  2.    1.   -0.25]
 [ 3.5   2.    1.   -0.5 ]
 [ 2.    3.25 -0.25  1.  ]
 [ 2.    3.5  -0.5   1.  ]
 [ 2.   -0.25  3.25  1.  ]
 [ 2.   -0.5   3.5   1.  ]
 [-0.25  2.    1.    3.25]
 [-0.5   2.    1.    3.5 ]]
>>> print([np.argmax(i) for i in deci_opt_1])
[0, 0, 1, 1, 2, 2, 3, 3]
>>> print(deci_opt_2)
[[-0.25  2.    1.    3.25]
 [-0.5   2.    1.    3.5 ]
 [ 2.   -0.25  3.25  1.  ]
 [ 2.   -0.5   3.5   1.  ]
 [ 2.    3.25 -0.25  1.  ]
 [ 2.    3.5  -0.5   1.  ]
 [ 3.25  2.    1.   -0.25]
 [ 3.5   2.    1.   -0.5 ]]
>>> print([np.argmax(i) for i in deci_opt_2])
[3, 3, 2, 2, 1, 1, 0, 0]

When we use np.argmax, the result should be consistent with y. So I think option 1 -- changing the sign of dec -- is right. What do you think?

@raghavrv
Member

@btdai Thanks for the patient response. I think you are right! Option 1 is the correct way to go here. Apologies for the confusion...

The test should be along the lines of

import numpy as np
from sklearn.svm import SVC
from sklearn.utils.testing import assert_true
from sklearn.utils.testing import assert_array_equal

base_points = np.array([[1, 1], [2, 2], [2, 1], [1, 2]])

X = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [-1, -1],  # Q3
    base_points * [1, -1]    # Q4
    ))

y = [1] * 4 + [2] * 4 + [3] * 4 + [4] * 4

# One point close to the decision boundary and another far away
base_points = np.array([[5, 5], [10, 10]])

# For all the quadrants (classes)
X_test = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [-1, -1],  # Q3
    base_points * [1, -1],   # Q4
    ))

svm = SVC(kernel='linear', decision_function_shape='ovr')
svm.fit(X, y)

decisions = svm.decision_function(X_test)

# Subtract 1 from predictions to get indices for that class (Quadrant)
pos_class_decisions = decisions[range(8), svm.predict(X_test) - 1].reshape([4, 2])

# Test if the point closer to the decision boundary has a lower value
# compared to a point farther away from the boundary
assert_true(np.all(pos_class_decisions[:, 0] < pos_class_decisions[:, 1]))

# Assert that the predicted class has the maximum value
# Add 1 to convert argmax positions to class labels (Quadrants)
assert_array_equal(np.argmax(decisions, axis=1) + 1, svm.predict(X_test))

# [1000, 0.1] is closer to Q4 compared to Q2/Q3
confidences = svm.decision_function([[1000, 0.1]])[0]

confidence_Q4 = confidences[4 - 1]
confidence_Q3 = confidences[3 - 1]
confidence_Q2 = confidences[2 - 1]

assert_true(confidence_Q4 > confidence_Q3)
assert_true(confidence_Q4 > confidence_Q2)
assert_true(confidence_Q2 > confidence_Q3)

If you have a better test, please feel free to use that! :)

This test fails in master as well as with my (incorrect) fix...

@amueller
Member

I don't understand how this test tests the right thing. We want a tie between the hard-predicted classes on the test point, right? So you should compare [.1, .1] and [1, 1] in all the quadrants and see that the decision function for that quadrant is higher further away from (0, 0).
Also, I feel like you overcomplicate things by having four points per class. Why not one?
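The check Andreas describes could be sketched as follows (my own interpretation, assuming the fixed behaviour from this PR; one training point per class, and plain asserts instead of the sklearn.utils.testing helpers):

```python
import numpy as np
from sklearn.svm import SVC

# One training point per quadrant (= class).
X_train = np.array([[1., 1.], [-1., 1.], [-1., -1.], [1., -1.]])
y_train = [0, 1, 2, 3]

svm = SVC(kernel='linear', decision_function_shape='ovr').fit(X_train, y_train)

# For each quadrant: a point near the origin and the same direction
# scaled further out.
near = np.array([[.1, .1], [-.1, .1], [-.1, -.1], [.1, -.1]])
far = 10 * near

deci_near = svm.decision_function(near)
deci_far = svm.decision_function(far)

# The decision value for a quadrant's own class should be higher for the
# point further away from (0, 0) -- this holds with the fix in this PR.
for q in range(4):
    assert deci_far[q, q] > deci_near[q, q]
```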

@raghavrv
Member

So you should compare [.1, .1] and [1, 1] in all the quadrants and see that the decision function for that quadrant is higher further away from 0,0.

It instead checks

  1. If the decision value for [5, 5] is less than that for [10, 10] in all quadrants. This is currently not the case in master but is fixed in this PR... (for [5, 5] the decision value is 3 + 0.25 and for [10, 10] it is 3 + 0.5)
  2. If the decision values for [1000, 0.1] (which causes a tie in some classes but not all) indicate that it lies in Quadrant 1 and is closer to Q4 than to Q2/Q3. This also is not the case in master and is fixed by this PR.

I feel like you overcomplicate things by having four points per class. Why not one?

True, we need not have 4 points to train. Just [-1, 1] etc. would suffice...

@btdai
Contributor Author

btdai commented Oct 26, 2016

Thank you @raghavrv, I think your test case is good, but I'm still new to the scikit-learn community and I'm not sure whether @amueller Andreas' idea is to have just one test case added. I made X and X_test the same length, so we can compare the predictions array with y directly. I have also changed the class labels so that we do not need to add or subtract when comparing.

This piece of code can't run on my scikit-learn (mine is not the developer version); could you help run it and see if the current code works in your development setup?

Also, I think we should assert that the deci_vals for the positive class are greater than 0 as well.

import numpy as np
from sklearn.svm import SVC
from sklearn.utils.testing import assert_true
from sklearn.utils.testing import assert_array_equal

base_points = np.array([[1, 2], [2, 1]])

X = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [-1, -1],  # Q3
    base_points * [1, -1]    # Q4
    ))

y = [0] * 2 + [1] * 2 + [2] * 2 + [3] * 2

# One point close to the decision boundary and another far away
base_points = np.array([[5, 5], [10, 10]])

# For all the quadrants (classes)
X_test = np.vstack((
    base_points * [1, 1],    # Q1
    base_points * [-1, 1],   # Q2
    base_points * [-1, -1],  # Q3
    base_points * [1, -1],   # Q4
    ))

svm = SVC(kernel='linear', decision_function_shape='ovr')
svm.fit(X, y)

predictions = svm.predict(X_test)

# Test if the prediction is the same as y
assert_array_equal(predictions, y)

deci_vals = svm.decision_function(X_test)

# Assert that the predicted class has the maximum value
assert_array_equal(np.argmax(deci_vals, axis=1), svm.predict(X_test))

# Get decision value at test points for the predicted class
pos_class_deci_vals = deci_vals[range(8), svm.predict(X_test)].reshape([4, 2])

# we shall also assert pos_class_deci_vals > 0 here

# Test if the point closer to the decision boundary has a lower decision value
# compared to a point farther away from the boundary
assert_true(np.all(pos_class_deci_vals[:, 0] < pos_class_deci_vals[:, 1]))

@amueller
Member

@raghavrv ok, I didn't entirely understand your test case. I'm good with just including it, commenting it thoroughly and making it as succinct as possible.

@raghavrv raghavrv changed the title Fix in svm ovr_decision_function sign [WIP] FIX bug in svm's decision values when decision_function_shape is 'ovr' Oct 30, 2016
@amueller
Member

amueller commented Nov 1, 2016

@btdai would you like to finish this up? I'd like to get this into the 0.18.1 release and it's the last blocking issue. Otherwise maybe @raghavrv wants to pick it up?

@btdai
Contributor Author

btdai commented Nov 2, 2016

@amueller I'm sorry to be the last one blocking this issue. When would you like to release 0.18.1?

@btdai
Contributor Author

btdai commented Nov 2, 2016

Sorry to ask again: I see there's test_multiclass.py in master/sklearn/tests. Do I need to create another test just for this function?

@jnothman
Member

jnothman commented Nov 2, 2016

We would like to release ASAP. The test should be in sklearn/svm/tests/test_svm.py.

@btdai
Contributor Author

btdai commented Nov 2, 2016

Hi, I have created this PR: #7810

It adds the test file test_ovr_decision_function.py in sklearn/svm/tests.

Please review this PR, thank you.

@lesteve
Member

lesteve commented Nov 2, 2016

You need to add the test in this PR and not create a separate PR; I will close #7810. Also, could you add the tests in sklearn/svm/tests/test_svm.py as @jnothman mentioned above, and not in a separate file?

For this you just need to create an additional commit in this branch, i.e. btdai_patch_ovr_decision_function, and then use git push to update the branch on your fork. Sorry if this is not clear enough; maybe try reading again the links about GitHub that we gave you before. Do say if you get completely stuck, though.

@btdai
Contributor Author

btdai commented Nov 2, 2016

@lesteve thank you for your comments, yeah, I should have just added the test case in this branch.

I've now added it; it seems to be going through some automatic checks.

@lesteve
Member

lesteve commented Nov 2, 2016

Please add your test_ovr_decision_function function in sklearn/svm/tests/test_svm.py and do not create a new file (sklearn/svm/tests/test_ovr_decision_function.py).

Also, Travis is failing because of flake8 issues; if you are on Linux or OS X you can run bash build_tools/travis/flake8_diff.sh locally and fix the errors. They are mostly unused imports and whitespace issues.

@raghavrv raghavrv changed the title [MRG + 1] FIX bug in svm's decision values when decision_function_shape is 'ovr' [MRG + 2] FIX bug in svm's decision values when decision_function_shape is 'ovr' Nov 6, 2016
@raghavrv
Member

raghavrv commented Nov 6, 2016

Needs a rebase. Then good for merge. Thanks @btdai

@jnothman
Member

jnothman commented Nov 7, 2016

Sorry, still needs a rebase.

@jnothman
Member

jnothman commented Nov 7, 2016

i.e. you probably need to update your local copy of master, then merge or rebase with it.

@btdai
Contributor Author

btdai commented Nov 7, 2016

Yeah, git rebase btdai_patch_...
always gives me an error.

@btdai
Contributor Author

btdai commented Nov 7, 2016

bingtian@ASPIRES7:~/git$ git rebase btdai_patch_ovr_decision_function
Current branch btdai_patch_ovr_decision_function is up to date.

rebase says up to date

so I push with
git push https://github.com/btdai/scikit-learn.git btdai_patch_ovr_decision_function

Still can't make it

@jnothman
Member

jnothman commented Nov 7, 2016

You want something like:

git checkout master
git pull https://github.com/scikit-learn/scikit-learn master
git checkout btdai_patch_ovr_decision_function
git merge master  # or git rebase master, but I think merge should work
# resolve merge conflicts and add conflicted files, then
git commit
git push

@btdai
Contributor Author

btdai commented Nov 7, 2016

Hi, I did git rebase with -i,

but then when I tried to push, it said "Everything up-to-date".

Is that normal?

Thanks.

Please help @amueller @raghavrv @jnothman

bingtian@ASPIRES7:~/git$ git rebase -i btdai_patch_ovr_decision_function
Successfully rebased and updated refs/heads/btdai_patch_ovr_decision_function.
bingtian@ASPIRES7:~/git$ git push https://github.com/btdai/scikit-learn.git btdai_patch_ovr_decision_function
Username for 'https://github.com': btdai
Password for 'https://[email protected]':
Everything up-to-date
bingtian@ASPIRES7:~/git$ git push --force https://github.com/btdai/scikit-learn.git btdai_patch_ovr_decision_function
Username for 'https://github.com': btdai
Password for 'https://[email protected]':
Everything up-to-date

…into btdai_patch_ovr_decision_function

Conflicts:
	doc/whats_new.rst
@btdai
Contributor Author

btdai commented Nov 7, 2016

Ok, I should have done

git merge master

Thank you @jnothman

@jnothman
Member

jnothman commented Nov 7, 2016

git rebase master would have worked fine, together with a force-push. We are used to asking for rebases because it was more necessary before some (relatively) recent changes to GitHub. I think merges are fine now.

@@ -4881,8 +4886,12 @@ David Huard, Dave Morrill, Ed Schofield, Travis Oliphant, Pearu Peterson.

.. _Peng Meng: https://github.com/mpjlu

<<<<<<< HEAD
Member


You've not correctly resolved merge conflicts

@jnothman jnothman merged commit 46001a0 into scikit-learn:master Nov 7, 2016
@jnothman
Member

jnothman commented Nov 7, 2016

Thanks, @btdai!

@btdai
Contributor Author

btdai commented Nov 7, 2016

thank you for your help, @jnothman

I can work on the next bug now :)

@btdai btdai deleted the btdai_patch_ovr_decision_function branch November 8, 2016 06:39
amueller pushed a commit to amueller/scikit-learn that referenced this pull request Nov 9, 2016
sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

Successfully merging this pull request may close these issues.

ovr decision_function calculation from ovo case
5 participants