Member

@adam2392 adam2392 commented Feb 28, 2024

Reference Issues/PRs

Fixes: #16153
Fixes: #17184

What does this implement/fix? Explain your changes.

  • Adds True/False labels on top of the arrows that go from the root node to its left/right child
  • Updates the unit test

Any other comments?

Images from `plot_tree` at different figure sizes (images not preserved here):

  • default
  • figsize=(10, 10)
  • figsize=(20, 20)
  • figsize=(5, 10)

Member Author

adam2392 commented Feb 28, 2024

from matplotlib import pyplot as plt

from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the iris data and hold out a test split.
iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small tree (3 leaves) keeps the plot readable.
clf = DecisionTreeClassifier(max_leaf_nodes=3, random_state=0)
clf.fit(X_train, y_train)

# Plot the fitted tree; the arrows leaving the root carry the True/False labels.
fig, ax = plt.subplots(figsize=(12, 12))
tree.plot_tree(
    clf,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    filled=True,
    ax=ax,
)
plt.show()

Signed-off-by: Adam Li <[email protected]>

github-actions bot commented Feb 28, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 18876a7. Link to the linter CI: here

Signed-off-by: Adam Li <[email protected]>
text_pos,
ha="center",
va="center",
fontsize=10,
Member

Does this have to be self.fontsize if it is not None?

How does this annotation scale when the figsize is large (e.g. figsize=(20, 20))?

Member Author

I refactored the kwargs so that the font size is no longer hardcoded here. I also added images at various figure sizes to the PR description.

@adam2392 adam2392 requested a review from thomasjpfan March 2, 2024 16:52
Contributor

@Charlie-XIAO Charlie-XIAO left a comment

The rendering looks nice 😊 Here are some minor comments that would probably make the code cleaner.

adam2392 and others added 2 commits March 2, 2024 12:47
Signed-off-by: Adam Li <[email protected]>
Member Author

adam2392 commented Mar 2, 2024

Thanks for the review, @thomasjpfan and @Charlie-XIAO!

I addressed your comments.

@adam2392 adam2392 requested a review from Charlie-XIAO March 2, 2024 17:52
Contributor

@Charlie-XIAO Charlie-XIAO left a comment

Thanks! Here are some additional suggestions on the positions of the True/False labels that came up when I read through the diff again. I'm not sure if my suggested solution would work, so you may need to try it out :)

Also it seems that Codecov is complaining because we do not have a test that uses the fontsize parameter. It is not originally caused by this PR but maybe you can add a test to cover that.

adam2392 added 2 commits March 3, 2024 09:47
Signed-off-by: Adam Li <[email protected]>
Member Author

adam2392 commented Mar 3, 2024

> Also it seems that Codecov is complaining because we do not have a test that uses the fontsize parameter. It is not originally caused by this PR but maybe you can add a test to cover that.

Added a unit test for fontsize.

Member Author

@adam2392 adam2392 left a comment

Thanks for the review, @Charlie-XIAO! figsize=(20, 8) now works well.

Contributor

@Charlie-XIAO Charlie-XIAO left a comment

Thanks @adam2392! The plots look good to me now :)

Comment on lines +736 to +739
if node.parent.left() == node:
    label_text, label_ha = ("True ", "right")
else:
    label_text, label_ha = (" False", "left")
Contributor

@Charlie-XIAO Charlie-XIAO Mar 4, 2024

For other reviewers' reference: the spaces before and after the True/False labels create some offset from the arrow, in addition to the horizontal alignment. I'm not sure if there is a nicer way to do this. Note that we already create padding with spaces in annotation text elsewhere (e.g. this line), so maybe this is fine?
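
To make the rule in the diff above easier to try in isolation, here is a minimal sketch in plain Python. The `Node` class and the `edge_label` helper are hypothetical stand-ins, not the classes used in scikit-learn; the sketch assumes, as the PR description says, that only the arrows leaving the root get a label.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for the layout node used while drawing the tree."""

    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, name: str) -> "Node":
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def left(self) -> Optional["Node"]:
        # Assumption: the first child is the branch where the split condition holds.
        return self.children[0] if self.children else None


def edge_label(node: Node) -> Optional[Tuple[str, str]]:
    """Return (text, horizontal alignment) for the arrow leading to `node`.

    Only direct children of the root get a label; other nodes return None.
    """
    if node.parent is None or node.parent.parent is not None:
        return None
    if node.parent.left() is node:
        # The trailing space pushes the right-aligned text away from the arrow.
        return ("True ", "right")
    # The leading space pushes the left-aligned text away from the arrow.
    return (" False", "left")


root = Node("root")
left, right = root.add_child("left"), root.add_child("right")
grandchild = left.add_child("grandchild")
```

Right-aligning "True " and left-aligning " False" at the same anchor point places the two labels on opposite sides of that point, and the padding space keeps each one from touching the arrow.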

This comment was marked as resolved.


@d-kleine d-kleine Mar 8, 2024

@Charlie-XIAO Currently, displaying the labels in the tree looks "hardcoded". What do you think about adding a plot_tree() parameter that controls the labels for the first split, in case someone prefers not to display them? Maybe a parameter called display_labels, like in sklearn.metrics.ConfusionMatrixDisplay.
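
Until such a parameter exists, one possible workaround is to hide the labels after plotting. The sketch below is only an illustration: `hide_edge_labels` and `FakeText` are hypothetical names, and it assumes the True/False labels are text artists whose stripped text is exactly "True" or "False".

```python
EDGE_LABELS = {"True", "False"}


def hide_edge_labels(annotations):
    """Hide artists whose stripped text is exactly "True" or "False".

    Works with any objects that expose get_text() and set_visible(),
    such as matplotlib text artists. Returns the number of artists hidden.
    """
    hidden = 0
    for ann in annotations:
        if ann.get_text().strip() in EDGE_LABELS:
            ann.set_visible(False)
            hidden += 1
    return hidden


class FakeText:
    """Stand-in with the two methods used above (not a matplotlib class)."""

    def __init__(self, text):
        self._text = text
        self.visible = True

    def get_text(self):
        return self._text

    def set_visible(self, visible):
        self.visible = visible


# Demonstration without matplotlib: one node box and the two edge labels.
artists = [FakeText("petal width <= 0.8"), FakeText("True "), FakeText(" False")]
hidden = hide_edge_labels(artists)
```

With a real figure, the candidates to pass in would be the annotation artists on the axes, for example the list that plot_tree returns, assuming the edge labels are among them.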

Member Author

That seems reasonable to me, but I'll wait to see what the dev team thinks, since it adds complexity to the API.

Member

For now, I'm okay with the current implementation. Adding True and False labels is already an improvement.

BTW resolves #17184

@Charlie-XIAO Charlie-XIAO added the "Waiting for Second Reviewer" label (First reviewer is done, need a second one!) Mar 5, 2024
xycoords="axes fraction",
bbox=self.bbox_args.copy(),
arrowprops=self.arrow_args.copy(),
**non_box_kwargs,
Member

If kwargs applies to annotations with a bounding box, then it's strange to have non_box_kwargs here.

If they share some kwargs, then I prefer this naming:

Suggested change
- **non_box_kwargs,
+ **common_box_kwargs,

Member Author

@adam2392 adam2392 Mar 14, 2024

I renamed it to common_kwargs, since the True/False annotations technically don't have a bounding box.
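
For readers following the naming discussion, here is a minimal sketch of how shared and box-only keyword arguments could be split. It is an illustration, not the merged code: `build_annotation_kwargs` is a hypothetical helper, and the sketch assumes the font size is only forwarded when the user set one.

```python
def build_annotation_kwargs(bbox_args, arrow_args, fontsize=None):
    """Split annotation keyword arguments into shared and box-only parts.

    Returns (common_kwargs, box_kwargs). The first is meant for every
    annotation, including the True/False labels that have no bounding box;
    the second adds the box and arrow styling used for the node boxes.
    """
    common_kwargs = {"xycoords": "axes fraction"}
    if fontsize is not None:
        # Only override the font size when the caller asked for one.
        common_kwargs["fontsize"] = fontsize

    box_kwargs = dict(
        ha="center",
        va="center",
        bbox=bbox_args.copy(),
        arrowprops=arrow_args.copy(),
        **common_kwargs,
    )
    return common_kwargs, box_kwargs


common, boxed = build_annotation_kwargs(
    {"boxstyle": "round"}, {"arrowstyle": "<-"}, fontsize=12
)
```

Keeping the box-free options in their own dict means the True/False labels can reuse them directly, which is why a name like common_kwargs reads more naturally than non_box_kwargs.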

b90bcfe

@adam2392 adam2392 requested a review from thomasjpfan March 14, 2024 13:40
Member

@thomasjpfan thomasjpfan left a comment

LGTM

@thomasjpfan thomasjpfan merged commit 8c9d0b2 into scikit-learn:main Mar 14, 2024
Labels: module:tree, Waiting for Second Reviewer (First reviewer is done, need a second one!)

Projects: None yet

Linked issues that merging this pull request may close:

  • sklearn.tree.plot_tree - Adding path labels to decision tree plots
  • Add threshold comparison above arrow in plot_tree

4 participants