
Pointwise Tag, use in aot_autograd partitioner #90029


Closed
wants to merge 7 commits into from

Conversation


@eellison eellison commented Dec 1, 2022

Stack from ghstack (oldest at bottom):

Takes the pointwise op list from DTensor as an initial starting point for pointwise ops, and feeds them to the aot autograd partitioner.

Edit: expanded to cover all ops in Unary, Binary, Ternary, and TensorCompare.
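For illustration, here is a rough sketch of the kind of helper this change adds (reconstructed from the review excerpts below; the exact code in the PR may differ). It collects every aten op that carries the new pointwise tag and hands the result to the partitioner's recomputable-op list:

```python
import functools
import torch

@functools.lru_cache(None)
def pointwise_ops():
    # Walk the (lazily populated) aten namespace and collect every overload
    # packet that has at least one overload tagged as pointwise.
    ops = []
    for attr_name in dir(torch.ops.aten):
        packet = getattr(torch.ops.aten, attr_name)
        if not isinstance(packet, torch._ops.OpOverloadPacket):
            continue
        for overload_name in packet.overloads():
            overload = getattr(packet, overload_name)
            if torch.Tag.pointwise in overload.tags:
                ops.append(packet)
                break
    return ops

# The partitioner then treats these ops as cheap to recompute:
# default_recomputable_ops += pointwise_ops()
```

Note that the review below points out a hazard with caching the full set this way: torch.ops.aten is lazily loaded, so the cache can be populated before all operators are registered.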

@eellison eellison requested a review from anjali411 as a code owner December 1, 2022 23:42

pytorch-bot bot commented Dec 1, 2022

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90029

❌ 2 Failures, 3 Pending as of commit fe63c54.

eellison added a commit that referenced this pull request Dec 1, 2022
…partitioner

ghstack-source-id: 0b87b72
Pull Request resolved: #90029
@eellison eellison added the topic: not user facing label Dec 1, 2022
@@ -346,6 +365,9 @@ def is_tensor_node(x):
# Natalia said that we should allow recomputing indexing :)
default_recomputable_ops += [aten.index]

# add more generally ?
default_recomputable_ops += pointwise_ops()
Collaborator

I recall the DTensor pointwise op set might not be a complete set of all the pointwise operators in ATen (although it covers most of them). Do you need to include all possible pointwise ops initially, or can we increase the coverage step by step?

Contributor

Does DTensor need the narrower set here?

Collaborator

Yeah, possibly DTensor needs a narrower set, as some of the pointwise ops might not make much sense for DTensor, but I would prefer the tag in native_functions to reflect all pointwise ops and leave the subsystems (i.e. DTensor or aot autograd) to decide and filter out the ops they need from the tagged set.

desc: |
Pointwise operators are operators where each element of the output is computed only by accessing
the corresponding element of all the broadcasted inputs. The output shape will be the broadcasted
shape of the inputs.
Contributor

Does pointwise imply that it is implemented with TensorIterator?

Contributor Author

Pointwise is more a description of how an operator is computed than its underlying implementation. I would imagine all pointwise operators should be computed with TensorIterator but I don't think that's a necessary condition. Open to input though.

Contributor

The main reason I ask is because there is a bunch of subtle striding behavior, which is unlikely to be done correctly if you're not using TensorIterator under the hood. So, it matters materially for the more subtle invariants whether or not you're buying into "all of the TensorIterator semantics". We don't have to call it TI but I am interested in knowing if we are giving these guarantees.

Contributor Author

I don't think so, although I would be interested to hear from @wanchaol and @Chillee, who have essentially been maintaining their own pointwise op sets.

Collaborator

To add some context about the DTensor pointwise op set: it was initially built from manual inspection of every op in native_functions.yaml by referencing https://pytorch.org/docs/stable/torch.html#pointwise-ops, and then we added some missing pointwise ops when we actually tried it on real models. I believe this set still does not include all possible pointwise ops (but it's close).

As for the meaning of pointwise ops, I think the description from @eellison is a fair enough definition of a pointwise op, but I agree we might need some formal algorithmic guard on it to ensure existing and newly added pointwise ops get tagged too.

ops.append(opoverloadpacket)
break

return ops
Contributor

If at all possible, we should figure out a way to do this that doesn't involve glomming all of the operators into an LRU cache. The main hazard of doing it this way is that torch.ops.aten is lazily loaded, so there isn't a guarantee that you will actually have all of the operators at the time you call this function. And this will be even worse if you want to support pointwise ops outside of the aten namespace.

Contributor Author

Hmm, what would you suggest? I think it would make sense for there to exist a database that maps from a tag to its corresponding operators. cc @anjali411.

That could be lazily loaded as well, but the invocation here would force loading.

Contributor

Yeah, that's a good idea. I think we can just create it in codegen when the native functions are parsed.

Currently there's no API for users to add tags from the Python library API, but in the future we'll have to ensure that this db stays up to date when new ops are added.

Contributor

I mentioned this in WC, but to repeat it here: for use in the partitioner, you don't need the entire set; at the point where you need to test whether an op is pointwise or not, just check the tag directly.
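A minimal sketch of that direct check (illustrative names, not the partitioner's actual code), assuming the FX node targets an aten OpOverload that exposes its tags:

```python
import torch

def is_pointwise(node):
    # Consult the op's tags directly instead of membership in a precomputed set.
    target = node.target
    return isinstance(target, torch._ops.OpOverload) and torch.Tag.pointwise in target.tags
```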


ezyang commented Dec 2, 2022

On my end, I need more clarity on what exactly pointwise means, and when it is permissible to add an operator to this set. "We just used the same list from DTensor" doesn't cut it in the long term.

@eellison eellison requested a review from ngimel December 2, 2022 15:36

eellison commented Dec 2, 2022

@ezyang

I took the initial set from the DTensor supported list so that it would be easier to remove that list going forward if there are extra invariants in it that are separate from the definition of pointwise ops, and so that you wouldn't have to individually review every operator I'm adding here.

I think:

Pointwise operators are operators where each element of the output is computed only by accessing
the corresponding element of all the broadcasted inputs. The output shape will be the broadcasted
shape of the inputs.

is a concise and adequate representation of pointwise. I'm happy to hear alternative definitions.
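To make the definition concrete, a small example of what it promises (the output takes the broadcasted shape, and each output element depends only on the corresponding broadcasted input elements):

```python
import torch

a = torch.randn(4, 1)
b = torch.randn(1, 3)
out = torch.add(a, b)  # a pointwise op

assert out.shape == torch.broadcast_shapes(a.shape, b.shape)  # (4, 3)
assert torch.equal(out[2, 1], a[2, 0] + b[0, 1])  # element-local computation
```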

As it stands, the list in AOT Autograd was missing the following operators, which prevented efficient recomputation and the resulting speedups for dynamo models.

aten::acosh
aten::addcdiv
aten::angle
aten::asinh
aten::bitwise_left_shift
aten::bitwise_right_shift
aten::bitwise_xor
aten::clip
aten::conj_physical
aten::deg2rad
aten::digamma
aten::erfinv
aten::exp2
aten::hypot
aten::i0
aten::igamma
aten::igammac
aten::logaddexp
aten::logaddexp2
aten::logical_not
aten::logical_xor
aten::logit
aten::masked_fill
aten::mvlgamma
aten::nan_to_num
aten::native_dropout_backward
aten::nextafter
aten::positive
aten::sgn
aten::sigmoid_backward
aten::signbit
aten::sinc
aten::square
aten::tanh_backward
aten::true_divide
aten::xlogy


ezyang commented Dec 3, 2022

I guess my main concern is (1) how do we know this list, as is, is correct, and (2) how do we make sure people keep this list up to date as new operators are added. In the original tags proposal, every tag was supposed to be accompanied by a test that could programmatically determine whether an operator was in the set or not. There isn't any such test currently, which suggests that we're pretty unlikely to keep this list up to date going forward.


wanchaol commented Dec 5, 2022

I guess my main concern is (1) how do we know this list, as is, is correct, and (2) how do we make sure people keep this list up to date as new operators are added. In the original tags proposal, every tag was supposed to be accompanied by a test that could programmatically determine whether an operator was in the set or not. There isn't any such test currently, which suggests that we're pretty unlikely to keep this list up to date going forward.

To ensure that the list is correct and to keep it up to date: I recall I had a conversation with @albanD about this. Alban suggested that we can compute the vjp of the operator, and if the Jacobian is a diagonal matrix, then we are certain that this is a pointwise op. Maybe we can use this approach to guard the list?
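A rough sketch of that guard for a unary op on a 1-D input (helper names are illustrative; a real guard would need to cover every sample input, broadcasting, and multi-input ops, which is what makes it expensive):

```python
import torch
from torch.autograd.functional import jacobian

def looks_pointwise(op, x):
    # For a pointwise op, output element i depends only on input element i,
    # so the Jacobian of a unary op over a 1-D input should be diagonal.
    J = jacobian(op, x)
    off_diagonal = J - torch.diag(torch.diagonal(J))
    return torch.allclose(off_diagonal, torch.zeros_like(off_diagonal))

x = torch.randn(8, dtype=torch.double)
print(looks_pointwise(torch.sin, x))                         # True
print(looks_pointwise(lambda t: torch.cumsum(t, dim=0), x))  # False
```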


ezyang commented Dec 5, 2022

Checking that the vjp is diagonal will certainly let you test that there aren't functions that are incorrectly tagged this way. But do you really want to test the inverse (that a new function isn't missing the tag) this way? Seems a bit questionable.


albanD commented Dec 5, 2022

Tbh, it is pretty hard to test programmatically with the definition above. You would have to test every function, every sample for that function, and every input value for those samples.
So whatever test we want to use (Jacobian, eps difference on the input, etc.) will be very expensive.

I think there are two options here:

  • Change the definition to be about the function itself rather than about its output values, so that we can check it without having to test the input/output mapping.
  • Add a new periodic validate_tags test that is very expensive but that we don't run often.


ezyang commented Dec 5, 2022

I'm also OK with white-box approaches; e.g., we grep the C++ source code or something and check whether it uses TensorIterator.
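A rough sketch of such a white-box check, assuming a local PyTorch checkout (the file list is taken from the discussion below and the regex mirrors the one the PR ends up using; as a later review comment notes, DEFINE_DISPATCH is only a proxy and does not by itself prove TensorIterator is used):

```python
import re
from pathlib import Path

POINTWISE_SOURCES = [
    "aten/src/ATen/native/UnaryOps.cpp",
    "aten/src/ATen/native/BinaryOps.cpp",
    "aten/src/ATen/native/TernaryOps.cpp",
    "aten/src/ATen/native/TensorCompare.cpp",
]
stub_re = re.compile(r"DEFINE_DISPATCH\((\w+)_stub\)")

def dispatch_stub_names(repo_root="."):
    # Collect the stub names declared in the pointwise kernel files as a proxy
    # for "implemented via a dispatch stub / TensorIterator-style kernel".
    names = set()
    for rel in POINTWISE_SOURCES:
        names.update(stub_re.findall(Path(repo_root, rel).read_text()))
    return names
```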


eellison commented Dec 5, 2022

@wanchaol how did you come up with this list originally?

@eellison eellison requested a review from mruberry as a code owner December 5, 2022 19:03

eellison commented Dec 5, 2022

Okay, I added all of the functions in UnaryOps.cpp, BinaryOps.cpp, TensorCompare.cpp, and TernaryOps.cpp, and I added a test that checks broadcasted shapes.

If people have further thoughts on testing, I can do something more exhaustive that we agree on. One test that might make sense would be a DEBUG test that checks, when running TensorIterator, that the most recently dispatched operator includes the pointwise tag. This would be very expensive, and we don't even have a DEBUG build right now, but I think it would work well.


ezyang commented Dec 5, 2022

I think it is pretty exhaustive. But I also want to note that we have landed other tags that are not actually being used and are incomplete (e.g. canonical).

I agree canonical is problematic, but that's no excuse lol. (And pointwise is being used for much more substantive stuff in PyTorch than canonical right now, so let's hold it to a higher standard.)

eellison added a commit that referenced this pull request Dec 6, 2022
…partitioner

ghstack-source-id: a414943
Pull Request resolved: #90029
@eellison eellison requested a review from ezyang December 6, 2022 02:49

ezyang commented Dec 6, 2022

This doesn't look like it does the regex?


eellison commented Dec 6, 2022

@ezyang did you mean as part of the build process? I grepped for DEFINE_DISPATCH locally and then used that to tag all of the operators.


ezyang commented Dec 6, 2022

I mean, have a unit test that runs whatever regex you did, and then check that the tags match (both positively and negatively).

"aten.mode.values",
)

regex = re.compile(r"DEFINE_DISPATCH\(.*_stub")
Contributor

Hrrmmm, DEFINE_DISPATCH doesn't actually mean it uses TensorIterator, does it? lol

"aten/src/ATen/native/TensorCompare.cpp",
]

allowed_functions = (
Contributor

The name allowed here is confusing; what you actually mean is manually denylisted for the pointwise tag, don't you?

@ezyang ezyang left a comment

Good enough for now. Thank you for humoring me.

eellison added a commit that referenced this pull request Dec 7, 2022
…partitioner

ghstack-source-id: 983b047
Pull Request resolved: #90029
@@ -346,6 +365,9 @@ def is_tensor_node(x):
# Natalia said that we should allow recomputing indexing :)
default_recomputable_ops += [aten.index]

# add more generally ?
default_recomputable_ops += pointwise_ops()
@eellison eellison Dec 7, 2022

We should probably also check that inductor has a lowering for it... will update in another PR.
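A speculative sketch of that follow-up (an assumption, not part of this PR): it relies on inductor exposing its lowering table as torch._inductor.lowering.lowerings, a private detail that may be named differently or change between releases.

```python
from torch._inductor.lowering import lowerings  # assumed private API

def recomputable_pointwise_ops():
    # Only treat a pointwise op as recomputable if inductor can actually lower it.
    return [op for op in pointwise_ops() if op in lowerings]
```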

eellison added a commit that referenced this pull request Dec 7, 2022
…partitioner

ghstack-source-id: 7e51a8e
Pull Request resolved: #90029

eellison commented Dec 8, 2022

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Dec 8, 2022
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).



out_shape = torch._refs._broadcast_shapes(*shapes)

for out_elem in tree_flatten(out):
Contributor

do we actually have any pointwise ops that return more than 1 tensor?

Contributor

oh maybe the _foreach_* ops fall in this category

Contributor Author

frexp
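For reference, a sketch of the broadcast-shape style check under discussion, written against OpInfo-style samples (names are illustrative, and it uses the public torch.broadcast_shapes rather than the private ref; the inner loop is what handles multi-output ops like frexp):

```python
import torch
from torch.utils._pytree import tree_flatten

def check_pointwise_output_shapes(op, samples):
    # Every tensor output of a pointwise op should carry the broadcasted shape
    # of its tensor inputs.
    for sample in samples:
        flat_args, _ = tree_flatten((sample.input, sample.args))
        shapes = [a.shape for a in flat_args if isinstance(a, torch.Tensor)]
        expected = torch.broadcast_shapes(*shapes)
        out = op(sample.input, *sample.args, **sample.kwargs)
        for out_elem in tree_flatten(out)[0]:
            if isinstance(out_elem, torch.Tensor):
                assert out_elem.shape == expected, (op, out_elem.shape, expected)
```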

@pytorchmergebot

Merge failed

Reason: 2 additional jobs have failed, first few of them are: linux-binary-manywheel, linux-binary-manywheel / manywheel-py3_7-cuda11_6-test / build



eellison commented Dec 8, 2022

@pytorchbot merge -f "unrelated flakey failure"

@pytorchmergebot

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).


kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022
…partitioner (pytorch#90029)

Takes the pointwise op list from [DTensor](https://github.com/pytorch/pytorch/blob/master/torch/distributed/_tensor/ops/pointwise_ops.py#L36) as an initial starting point for pointwise ops, and feeds them to the aot autograd partitioner.

Pull Request resolved: pytorch#90029
Approved by: https://github.com/ezyang
@eellison eellison changed the title Add Pointwise Tag from pointwise set in DTensor, use in aot_autograd partitioner Pointwise Tag, use in aot_autograd partitioner Dec 12, 2022
@facebook-github-bot facebook-github-bot deleted the gh/eellison/369/head branch June 8, 2023 16:18
Labels
ciflow/inductor · ciflow/trunk · Merged · topic: not user facing