@GuillemGSubies
Contributor

Reference Issues/PRs

Fixes #14301

What does this implement/fix? Explain your changes.

I just took out an else so target_idx does not get overwritten.
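To illustrate the kind of bug described here, the following is a hypothetical simplification, not the actual scikit-learn source. The function names, the `multioutput_regressor` flag, and the use of a plain list for the classes are assumptions made for the sketch. It shows how a trailing `else` can overwrite an index that was computed earlier for the multiclass case, and how dropping that `else` preserves it.

```python
def target_idx_buggy(classes, target, multioutput_regressor=False):
    """Hypothetical reconstruction of the bug pattern (not the real source)."""
    if len(classes) > 2:
        # multiclass: look up the position of the requested class
        target_idx = list(classes).index(target)
    else:
        # regression and binary classification
        target_idx = 0
    # ... unrelated setup would happen here ...
    if multioutput_regressor:
        target_idx = target
    else:
        # bug: this branch also runs for classifiers and resets the index
        target_idx = 0
    return target_idx


def target_idx_fixed(classes, target, multioutput_regressor=False):
    """Same logic with the trailing else removed."""
    if len(classes) > 2:
        target_idx = list(classes).index(target)
    else:
        target_idx = 0
    if multioutput_regressor:
        target_idx = target
    # no else branch: the multiclass index computed above survives
    return target_idx


classes = ["setosa", "versicolor", "virginica"]
buggy = target_idx_buggy(classes, "virginica")
fixed = target_idx_fixed(classes, "virginica")
```

In this sketch the buggy variant returns index 0 whatever class is requested, which matches the symptom reported in the linked issue.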

Any other comments?

I didn't know the best way to test it. Right now I check the y-axis data of the plots and make sure they are not the same (the bug made them equal all the time).
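The idea behind that check can be sketched without any plotting. The example below is only an illustration under stated assumptions: `predict_proba` is a made-up three-class softmax model, `partial_dependence_curve` is a hypothetical helper that averages predicted probabilities (the library may average a different response), and the data are invented. It computes one curve per target class and asserts that they differ.

```python
import math


def predict_proba(row):
    """Toy 3-class softmax model on two features (hypothetical)."""
    weights = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
    scores = [w0 * row[0] + w1 * row[1] for w0, w1 in weights]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def partial_dependence_curve(X, feature, grid, target_idx):
    """Average probability of class `target_idx` as `feature` sweeps `grid`."""
    curve = []
    for value in grid:
        probs = []
        for row in X:
            modified = list(row)
            modified[feature] = value  # force the feature to the grid value
            probs.append(predict_proba(modified)[target_idx])
        curve.append(sum(probs) / len(probs))
    return curve


X = [(0.0, 1.0), (1.0, 0.5), (2.0, -1.0)]
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
curve_class0 = partial_dependence_curve(X, 0, grid, target_idx=0)
curve_class2 = partial_dependence_curve(X, 0, grid, target_idx=2)

# If the target were ignored, both curves would be identical.
assert curve_class0 != curve_class2
```

A test on the real plots follows the same shape: draw the plot for two different targets and compare the y data of the resulting lines.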

Also, I have a question: Here it should be int or str, shouldn't it?

target : int, optional (default=None)

@GuillemGSubies GuillemGSubies marked this pull request as ready for review July 17, 2019 14:36
@GuillemGSubies GuillemGSubies changed the title [WIP] plot_partial_dependence taking multiclass into account [MRG] plot_partial_dependence taking multiclass into account Jul 17, 2019
@GuillemGSubies GuillemGSubies changed the title [MRG] plot_partial_dependence taking multiclass into account [MRG] fix plot_partial_dependence not taking multiclass into account Jul 17, 2019
@GuillemGSubies GuillemGSubies changed the title [MRG] fix plot_partial_dependence not taking multiclass into account [MRG] fix plot_partial_dependence not taking targe into account when multiclass Jul 17, 2019
@GuillemGSubies GuillemGSubies changed the title [MRG] fix plot_partial_dependence not taking targe into account when multiclass [MRG] fix plot_partial_dependence not taking target into account when multiclass Jul 17, 2019
@amueller
Member

cc @NicolasHug

Member

@glemaitre glemaitre left a comment

Otherwise LGTM

@glemaitre glemaitre added this to the 0.21.3 milestone Jul 18, 2019
@glemaitre
Member

Also, I have a question: Here it should be int or str, shouldn't it?

If y contains str then estimator.classes_ should contain str as well so you should be right.

@glemaitre
Member

You can add a test for it in fact (in another PR). Roughly, the test should be something like:

from sklearn.datasets import fetch_openml
from sklearn.tree import DecisionTreeClassifier
from sklearn.inspection import plot_partial_dependence

iris = fetch_openml('iris', as_frame=True, version=1)
df, y = iris.data, iris.target.to_numpy()
clf = DecisionTreeClassifier().fit(df, y)
# the class labels are strings, so classes_ has object dtype
assert clf.classes_.dtype == object
# check that the pdp with str and int give the same results
# pick-up the last class
# implement the assert as in this PR
plot_partial_dependence(clf, df, [0], target='Iris-virginica')
plot_partial_dependence(clf, df, [0], target=2)

@GuillemGSubies
Contributor Author

You can add a test for it in fact (in another PR).

Actually I asked because there is already a test about that

@glemaitre
Member

Actually I asked because there is already a test about that

Oh perfect then, so no need for an additional test ;)

@glemaitre glemaitre changed the title [MRG] fix plot_partial_dependence not taking target into account when multiclass [MRG+1] fix plot_partial_dependence not taking target into account when multiclass Jul 18, 2019
Member

@NicolasHug NicolasHug left a comment

Small comments but LGTM anyway.

Thanks for the fix @GuillemGSubies !

# check that the pd plots are the same for 0 and "setosa"
assert all(axs[0].lines[0]._y == axs2[0].lines[0]._y)
# check that the pd plots are different for another target
clf = GradientBoostingClassifier(n_estimators=10, random_state=1)
Member

@NicolasHug NicolasHug Jul 18, 2019

I think you can remove a few lines, namely the clf definition and fitting, as well as the grid_resolution.

Contributor Author

OK, I will. I did not change those because I did not know whether they followed some standard you use when testing.

Member

@NicolasHug NicolasHug left a comment

# check that the pd plots are the same for 0 and "setosa"
assert all(axs[0].lines[0]._y == axs2[0].lines[0]._y)
# check that the pd plots are different for another target
clf.fit(iris.data, iris.target)
Member

you can remove this line too ;)

Contributor Author

Looks like I shouldn't have removed it. That means that if I train using the targets as strings, I cannot pass an int to plot_partial_dependence.
I don't know whether that is the expected behavior or not.
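One way to read this behavior, assuming the target is looked up by value among the fitted class labels rather than used as a position, is sketched below. `resolve_target_idx` is a hypothetical helper written for this illustration, not a scikit-learn function, and the class labels are invented.

```python
def resolve_target_idx(classes, target):
    """Map a class label to its position in `classes` (hypothetical helper)."""
    classes = list(classes)
    if len(classes) <= 2:
        # regression and binary classification use a single output
        return 0
    if target is None:
        raise ValueError("target must be specified for multiclass")
    if target not in classes:
        raise ValueError("target not in classes, got {!r}".format(target))
    return classes.index(target)


str_classes = ["setosa", "versicolor", "virginica"]
int_classes = [0, 1, 2]

idx_from_str = resolve_target_idx(str_classes, "virginica")
idx_from_int = resolve_target_idx(int_classes, 2)

# An int target is not a member of string-valued classes, so it is rejected.
try:
    resolve_target_idx(str_classes, 2)
    int_on_str_rejected = False
except ValueError:
    int_on_str_rejected = True
```

Under that assumption, an int target only works when the estimator was fit on int labels, which is consistent with the test needing a refit before switching label types.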

Member

Oh OK, my bad, I didn't realize it was fit on something different before

Contributor Author

Reverted

@glemaitre glemaitre merged commit f82d966 into scikit-learn:master Jul 19, 2019
@glemaitre
Member

Thanks @GuillemGSubies

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Jul 24, 2019

Development

Successfully merging this pull request may close these issues.

Bug on partial dependence plot for multiclass classifiers. target_idx is always rewritten.

4 participants