[Quant][PT2E] change flatten recipe for X86InductorQuantizer #136298
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136298
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (4 unrelated failures.) As of commit aed4b1d with merge base 0cdc6a8, the following jobs failed but were likely due to flakiness present on trunk (FLAKY).
This comment was automatically generated by Dr. CI and updates every 15 minutes.
    elif (
        node.target is torch.ops.aten.flatten.using_ints
        and len(node.users) > 0
        and not is_any_users_connected_to_quantizable_op(node.users)
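For context, a hedged sketch of what the helper checked here might look like; its real body is not shown in this excerpt, and the members of quantizable_ops below are assumptions for illustration only:

    import torch

    # Assumed for illustration: the set of op overloads that
    # X86InductorQuantizer treats as quantizable.
    quantizable_ops = {
        torch.ops.aten.conv2d.default,
        torch.ops.aten.linear.default,
    }

    def is_any_users_connected_to_quantizable_op(users) -> bool:
        # users is node.users from torch.fx: a dict keyed by the user
        # nodes, so iterating over it yields the user nodes directly.
        return any(user.target in quantizable_ops for user in users)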
nit: this could be inlined as not any(user.target in quantizable_ops for user in node.users), since is_any_users_connected_to_quantizable_op is only used here.
Please also add a comment here for the special recipe of flatten.
Please add the summary of this PR and the description of this special flatten quant recipe.
Thank you for your review. The code and description have been updated.
What model does this change help with?
ViT
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
[Quant][PT2E] change flatten recipe for X86InductorQuantizer (#136298). Pull Request resolved: pytorch#136298. Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5
Stack from ghstack (oldest at bottom):
This PR modifies the flatten recipe: if none of the users of the flatten node are quantizable ops, int8 flatten will be disabled to avoid unnecessary dtype conversions.
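A minimal sketch of the pattern this recipe targets; the module and the choice of softmax as the non-quantizable consumer are illustrative assumptions, not taken from the PR:

    import torch

    class FlattenToNonQuantizable(torch.nn.Module):
        # Assumed example: flatten's only user is softmax, which the
        # quantizer does not treat as quantizable. Under this recipe the
        # flatten stays in fp32, so no dequant/quant pair is inserted
        # around it just to run the flatten in int8.
        def forward(self, x):
            x = torch.flatten(x, 1)          # lowers to aten.flatten.using_ints
            return torch.softmax(x, dim=-1)  # non-quantizable user

By contrast, a flatten that feeds a linear layer (the usual classifier-head pattern in ViT) still has a quantizable user, so its int8 recipe is unchanged.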
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang