[MRG+1] fix plot_partial_dependence not taking target into account when multiclass #14393

GuillemGSubies · 2019-07-17T14:00:00Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

I just took out an else so target_idx does not get overwritten.
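For readers following along, here is a minimal sketch of the kind of overwrite described above. It is not the scikit-learn source: the function names, the `multioutput` flag, and the plain-list `classes` argument are all hypothetical, chosen only to show how a trailing `else` can reset an index that was already resolved for a multiclass estimator.

```python
def resolve_target_idx_buggy(classes, target, multioutput=False):
    """Hypothetical buggy version: a later else branch resets the index."""
    target_idx = 0  # default for regression and binary classification
    if len(classes) > 2:
        # multiclass: look the requested label up among the known classes
        target_idx = classes.index(target)
    if multioutput:
        target_idx = target
    else:
        # this branch also runs for multiclass and discards the lookup above
        target_idx = 0
    return target_idx


def resolve_target_idx_fixed(classes, target, multioutput=False):
    """Hypothetical fixed version: the resetting else branch is gone."""
    target_idx = 0  # default for regression and binary classification
    if len(classes) > 2:
        target_idx = classes.index(target)
    if multioutput:
        target_idx = target
    return target_idx


labels = ["setosa", "versicolor", "virginica"]
buggy_idx = resolve_target_idx_buggy(labels, "virginica")
fixed_idx = resolve_target_idx_fixed(labels, "virginica")
```

In this sketch the buggy variant returns the same index for every requested class, which is why every plot came out identical.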

Any other comments?

I didn't know what the optimal way to test it was. Right now I check the y-axis data of the plots and make sure the curves are not the same (the bug made them equal all the time).

Also, I have a question: Here it should be int or str, shouldn't it?

scikit-learn/sklearn/inspection/partial_dependence.py

Line 404 in c0c5313

target : int, optional (default=None)

amueller · 2019-07-17T15:51:49Z

cc @NicolasHug

glemaitre

Otherwise LGTM

sklearn/inspection/tests/test_partial_dependence.py

glemaitre · 2019-07-18T08:00:47Z

Also, I have a question: Here it should be int or str, shouldn't it?

If y contains str then estimator.classes_ should contain str as well so you should be right.

glemaitre · 2019-07-18T08:08:19Z

You can add a test for it in fact (in another PR). Roughly, the test should be something like:

from sklearn.datasets import fetch_openml
from sklearn.tree import DecisionTreeClassifier
from sklearn.inspection import plot_partial_dependence

iris = fetch_openml('iris', as_frame=True, version=1)
df, y = iris.data, iris.target.to_numpy()
clf = DecisionTreeClassifier().fit(df, y)
# the class labels are strings, so classes_ has object dtype
assert clf.classes_.dtype == object
# check that the pdp with str and int give the same results
# pick-up the last class
# implement the assert as in this PR
plot_partial_dependence(clf, df, [0], target='Iris-virginica')
plot_partial_dependence(clf, df, [0], target=2)

GuillemGSubies · 2019-07-18T08:50:34Z

Actually I asked because there is already a test about that

scikit-learn/sklearn/inspection/tests/test_partial_dependence.py

Line 503 in 67130a5

target='setosa',

glemaitre · 2019-07-18T12:12:40Z

Actually I asked because there is already a test about that

Oh perfect then, so no need for an additional test ;)

NicolasHug

Small comments but LGTM anyway.

Thanks for the fix @GuillemGSubies !

NicolasHug · 2019-07-18T12:18:39Z

sklearn/inspection/tests/test_partial_dependence.py

+    # check that the pd plots are the same for 0 and "setosa"
+    assert all(axs[0].lines[0]._y == axs2[0].lines[0]._y)
+    # check that the pd plots are different for another target
+    clf = GradientBoostingClassifier(n_estimators=10, random_state=1)


I think you can remove a few lines, namely the clf definition and fitting, as well as the grid_resolution.

Ok, I will. I did not change those because I did not know if they had to do with some standard you use when testing.

doc/whats_new/v0.21.rst

NicolasHug · 2019-07-18T17:27:59Z

sklearn/inspection/tests/test_partial_dependence.py

+    # check that the pd plots are the same for 0 and "setosa"
+    assert all(axs[0].lines[0]._y == axs2[0].lines[0]._y)
+    # check that the pd plots are different for another target
+    clf.fit(iris.data, iris.target)


you can remove this line too ;)

Looks like I shouldn't have removed it. That means that if I train using the targets as strings, I cannot pass an int to plot_partial_dependence.
I don't know if that is the expected behavior or not.

Oh OK, my bad, I didn't realize it was fit on something different before
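
The behavior noted in this exchange can be illustrated with a small sketch, assuming the target is matched against the fitted class labels by value. The helper `find_target_idx` and the plain-list `classes` argument are hypothetical and only stand in for whatever lookup the library performs.

```python
def find_target_idx(classes, target):
    """Return the position of `target` in `classes`, matching by value.

    Hypothetical helper: a label is accepted only if it equals one of the
    fitted class labels, so the label type matters.
    """
    for idx, label in enumerate(classes):
        if label == target:
            return idx
    raise ValueError("target not in classes, got {!r}".format(target))


str_classes = ["setosa", "versicolor", "virginica"]  # fit on string labels
int_classes = [0, 1, 2]                              # fit on integer labels

idx_from_str = find_target_idx(str_classes, "setosa")
idx_from_int = find_target_idx(int_classes, 0)

try:
    find_target_idx(str_classes, 0)
    int_on_str_rejected = False
except ValueError:
    # the integer 0 is not one of the string labels
    int_on_str_rejected = True
```

Under that assumption, comparing the plot for 0 with the plot for "setosa" needs two fits, one per label type, which is consistent with the refit line having to stay in the test.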

glemaitre · 2019-07-19T07:59:50Z

Thanks @GuillemGSubies

guillem.garcia added 2 commits July 17, 2019 16:04

solved_#14301

33a04a9

Solved tests not working

e769647

GuillemGSubies marked this pull request as ready for review July 17, 2019 14:36

GuillemGSubies changed the title ~~[WIP] plot_partial_dependence taking multiclass into account~~ [MRG] plot_partial_dependence taking multiclass into account Jul 17, 2019

GuillemGSubies changed the title ~~[MRG] plot_partial_dependence taking multiclass into account~~ [MRG] fix plot_partial_dependence not taking multiclass into account Jul 17, 2019

GuillemGSubies changed the title ~~[MRG] fix plot_partial_dependence not taking multiclass into account~~ [MRG] fix plot_partial_dependence not taking targe into account when multiclass Jul 17, 2019

GuillemGSubies changed the title ~~[MRG] fix plot_partial_dependence not taking targe into account when multiclass~~ [MRG] fix plot_partial_dependence not taking target into account when multiclass Jul 17, 2019

glemaitre approved these changes Jul 18, 2019

glemaitre added this to the 0.21.3 milestone Jul 18, 2019

Better documented tests

71ad84f

glemaitre changed the title ~~[MRG] fix plot_partial_dependence not taking target into account when multiclass~~ [MRG+1] fix plot_partial_dependence not taking target into account when multiclass Jul 18, 2019

NicolasHug approved these changes Jul 18, 2019

GuillemGSubies and others added 3 commits July 18, 2019 16:22

Update doc/whats_new/v0.21.rst

489fcbd

Co-Authored-By: Nicolas Hug <[email protected]>

Less repeated code

cde6b1d

Merge branch 'bugfix_partial_dependence_plot_multiclass' of https://g…

fe332de

…ithub.com/GuillemGSubies/scikit-learn into bugfix_partial_dependence_plot_multiclass

NicolasHug reviewed Jul 18, 2019

GuillemGSubies added 2 commits July 18, 2019 19:47

Deleted more redundant code

5c76459

revert last commit

7b8a0fd

glemaitre merged commit f82d966 into scikit-learn:master Jul 19, 2019

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Jul 24, 2019

FIX plot_partial_dependence not taking target into account when multi…

2f662b4

…class (scikit-learn#14393)
