[WIP] fix reinplacing bug #152011

Open: zou3519 wants to merge 1 commit into base: gh/zou3519/1162/base

zou3519 (Contributor) commented Apr 23, 2025

Stack from ghstack (oldest at bottom):

There are two problems:

  1. canonicalize_view_scatter_ops adds new nodes to the graph, which
    makes the alias info on the graph wrong. To fix this, we re-run
    FakeTensorUpdater on the graph.
  2. FakeTensorUpdater's alias information is wrong. If a node was not
    previously seen, we need to recursively update the users of that node,
    even if its meta["val"] looks like it is set correctly. For example,
    suppose we have x = foo(...); y = x.view(...). If the user replaces
    foo with a new bar node and sets bar.meta["val"] correctly, then
    FakeTensorUpdater still needs to update y's meta["val"] to be a view
    of the new bar node's meta["val"].
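
The second problem can be illustrated with a small pure-Python toy model. This is only a sketch under stated assumptions, not the actual FakeTensorUpdater code: `FakeVal`, `Node`, `looks_same`, `refresh`, and the `skip_if_looks_same` flag are all hypothetical names, storage is reduced to an integer id, and "looks the same" is reduced to a shape comparison.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_storage_ids = itertools.count(1)  # fresh id per simulated allocation


@dataclass
class FakeVal:
    """Toy stand-in for a fake tensor: a shape plus a storage id."""
    shape: tuple
    storage: int


@dataclass(eq=False)
class Node:
    """Toy graph node. inp=None allocates storage; otherwise it is a view of inp."""
    name: str
    shape: tuple
    inp: Optional["Node"] = None
    meta: dict = field(default_factory=dict)


def compute_val(node):
    # A source node gets fresh storage; a view shares its input's storage.
    if node.inp is None:
        return FakeVal(node.shape, next(_storage_ids))
    return FakeVal(node.shape, node.inp.meta["val"].storage)


def looks_same(a, b):
    # Hypothetical metadata check that ignores aliasing entirely.
    return a.shape == b.shape


def refresh(node, nodes):
    # Recompute this node's value, then recursively recompute its users.
    node.meta["val"] = compute_val(node)
    for user in nodes:
        if user.inp is node:
            refresh(user, nodes)


def incremental_update(nodes, seen, skip_if_looks_same):
    for node in nodes:
        if node.name in seen:
            continue
        seen.add(node.name)
        if (skip_if_looks_same and "val" in node.meta
                and looks_same(compute_val(node), node.meta["val"])):
            continue  # early exit: the node's users are never refreshed
        refresh(node, nodes)


def replace_foo_with_bar(skip_if_looks_same):
    foo = Node("foo", (4,))
    y = Node("y", (2, 2), inp=foo)
    nodes, seen = [foo, y], set()
    incremental_update(nodes, seen, skip_if_looks_same)
    # A pass swaps foo for bar and sets bar.meta["val"] itself.
    bar = Node("bar", (4,))
    bar.meta["val"] = FakeVal((4,), next(_storage_ids))
    y.inp = bar
    nodes[0] = bar
    incremental_update(nodes, seen, skip_if_looks_same)
    # Does y's recorded storage match the node it is now a view of?
    return bar.meta["val"].storage == y.meta["val"].storage
```

In this toy, `replace_foo_with_bar(True)` returns `False` because `y` keeps pointing at the storage of the removed `foo` node, while `replace_foo_with_bar(False)` returns `True` because the unseen `bar` node always triggers a recursive refresh of its users.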

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov


pytorch-bot bot commented Apr 23, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152011

Note: Links to docs will display an error until the docs builds have been completed.

❌ 64 New Failures, 1 Unrelated Failure

As of commit 390998c with merge base a40e876 (image):

NEW FAILURES - The following jobs have failed:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zou3519 added a commit that referenced this pull request Apr 23, 2025
ghstack-source-id: c0cdce7
Pull Request resolved: #152011

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@@ -754,5 +754,6 @@ def tensor_with_same_storage_already_reinplaced(arg):
 def reinplace_inplaceable_ops(graph: torch.fx.Graph) -> None:
     with enable_python_dispatcher():
         canonicalize_view_scatter_ops(graph)
+        torch._inductor.fx_passes.post_grad.fake_tensor_updater.incremental_update()
zou3519 (Contributor, Author) commented:

The problem was:

  • canonicalize_view_scatter_ops finds and replaces some nodes.
  • Those new nodes have correct .meta["val"], but their users have incorrect .meta["val"].
  • So we need to update those users via FakeTensorUpdater.

Question: I don't know how to pass the FakeTensorUpdater into this pass (the GraphTransformObserver stuff is difficult to work with) so I set it as a global variable. I assume that's not what we want.

Comment on lines +171 to +174
# if "val" in node.meta and is_fake_tensor_same(
#     new_fake_tensor, node.meta["val"]
# ):
#     continue
zou3519 (Contributor, Author) commented:

This deletion only matters for keeping the alias info in the FakeTensor vals correct. Alias info usually doesn't need to be correct (it doesn't matter for functional transforms), but the reinplacing pass needs it to be correct.
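
To show why a reinplacing-style pass depends on alias info, here is a deliberately simplified sketch. It assumes a hypothetical rule (`can_mutate_in_place`) that allows an in-place mutation only when no other value is recorded as sharing the target's storage. The real pass is more involved, and the storage ids below are made up for illustration.

```python
from collections import Counter
from typing import Dict


def can_mutate_in_place(target: str, storage_of: Dict[str, int]) -> bool:
    """Toy rule: mutating `target` in place is allowed only if no other
    value is recorded as sharing its storage."""
    sharers = Counter(storage_of.values())
    return sharers[storage_of[target]] == 1


# Recorded storage id per value (hypothetical). y is really a view of bar.
accurate = {"bar": 2, "y": 2}  # alias info refreshed after the replacement
stale = {"bar": 2, "y": 1}     # y still points at the removed node's storage
```

With the accurate map, `can_mutate_in_place("bar", accurate)` is `False`, so the view `y` is protected. With the stale map the same call is `True`, which would let the toy pass mutate `bar` even though `y` aliases it.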
