

Do not cover up __dunder__ method type-hints from .pyi file #150875


Open: wants to merge 4 commits into main
Conversation


@alanhdu alanhdu commented Apr 8, 2025

In the build system, we generate a torch/_C/__init__.pyi that contains typehints for the TensorBase class that torch.Tensor inherits from, including type annotations for these dunder methods.

Unfortunately, by defining them here in _tensor.py, those annotations are automatically overwritten and "hidden", leading to a bunch of confusing type errors like

```python
def inv(x: torch.Tensor):
    # Unsupported operand [58]: `/` is not supported for operand types `int` and `torch._tensor.Tensor`.
    1 / x
```

This PR modifies the code to keep the runtime behavior of these functions while falling back on the .pyi annotations at type-checking time.
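The pattern can be sketched in plain Python (simplified stand-in classes, not the actual torch code): definitions placed under `if not TYPE_CHECKING:` exist at runtime but are invisible to type checkers, so the checker falls back to the annotations inherited from the base class's stubs.

```python
from typing import TYPE_CHECKING


class Base:
    # Stand-in for _C.TensorBase, whose generated .pyi stub carries the
    # authoritative annotations for the dunder methods.
    def __floordiv__(self, other: int) -> "Base":
        return self


class Tensor(Base):
    # Hidden from type checkers: at type-checking time Tensor has no
    # __floordiv__ of its own, so Base's (stub) annotation applies.
    if not TYPE_CHECKING:

        def __floordiv__(self, other):
            # Runtime behavior still comes from this definition.
            return Base()


result = Tensor() // 2
assert isinstance(result, Base)
```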

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @ezyang @malfet @xuzhao9 @gramster


pytorch-bot bot commented Apr 8, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150875

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (2 Unrelated Failures)

As of commit 39f0fd1 with merge base f47bf38:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@alanhdu alanhdu added the module: typing Related to mypy type annotations label Apr 22, 2025
torch/_tensor.py Outdated
Comment on lines 1103 to 1100
```python
__pos__ = _C.TensorBase.positive
__neg__ = _C.TensorBase.neg
__abs__ = _C.TensorBase.abs

# The typehints for these dunder methods are auto-generated as part of
# _C.TensorBase's typestubs, so use those.
if not TYPE_CHECKING:

    @_handle_torch_function_and_wrap_type_error_to_not_implemented
    def __floordiv__(self, other):
        return torch.floor_divide(self, other)

    @_handle_torch_function_and_wrap_type_error_to_not_implemented
    def __rfloordiv__(self, other):
        return torch.floor_divide(other, self)
```
alanhdu (Contributor Author) commented:

I don't quite understand why some things need to be inside the TYPE_CHECKING block and why some don't (and I'm not entirely sure how the torch/_C/__init__.pyi file is actually generated from the __init__.pyi.in file), but this combination seems to make the tests pass.


alanhdu commented Apr 23, 2025

Hm... it looks like there are a bunch of mypy failures because more things are resolving to Tensor, but since they are scalar Tensors at runtime they can be treated as int/float/whatever.

I'd be interested to know what the guidance here should be -- should I insert .item() calls in those places? Update the function signatures to also allow Tensor inputs? Or just add # type: ignore comments in the right places?

@colesbury colesbury requested review from Skylion007 and rec April 23, 2025 18:31
@colesbury colesbury added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Apr 23, 2025
@rec rec added the topic: not user facing topic category label Apr 24, 2025

rec commented Apr 24, 2025

Hello, and thanks for doing this!

It's a great idea, and will fix most of #145838

I tried to do this earlier in a different way; it got merged, but it ran into issues downstream with projects like torchrec and executorch and was reverted, and I could not debug it locally.

This might well work, since it's less invasive.


Number one thing - it's hard to read the diffs, and this is a pretty sensitive file. Could I convince you to re-order the method definitions in _tensor.py so they are in the same order as before, so it's easy to compare the old and new versions of each method?

Secondarily, in answer to your question, I think that using .item() is the best idea when using a scalar tensor as a plain old Python number.

The documentation on item() tells you just what is happening: it will raise an exception if the tensor is not scalar (you can easily imagine doing a lot of pointless computation on some matrix, thinking it was a scalar), and the overhead is not measurable.

Excited to see how this goes!

@alanhdu alanhdu requested review from albanD and janeyx99 as code owners April 29, 2025 19:47
@pytorch-bot pytorch-bot bot added ciflow/inductor oncall: distributed Add this issue/PR to distributed oncall triage queue labels Apr 29, 2025
@albanD albanD left a comment (Collaborator):

A lot of the non-type changes here are not ok. You should NOT change existing logic.


alanhdu commented Apr 29, 2025

@albanD

Going to my original question:

I'd be interested to know what the guidance here should be -- should I insert .item() calls in those places? Update the function signatures to also allow Tensor inputs? Or just add # type: ignore comments in the right places?

Would you prefer that I just use # type: ignore comments here instead then? Happy to do that instead of inserting item and None checks.


rec commented Apr 30, 2025

Well, I was going from a previous code review adding typing where I had the same issue - a scalar tensor being used as a float - and the reviewer suggested .item() because it makes all the types correct, and when I thought about it, I agreed.

I myself interpreted the "logic changes" to refer to the if not TYPE_CHECKING: lines within the class definition. Now that the diffs line up better, I do agree that it's a bit weird.

It's... unfortunate that my previous attempt got reverted because it broke downstream products like executorch, but I wasn't given a traceback I could use, and I couldn't manage to get executorch to build with the commit ID (we had this issue before as well, apparently this month's executorch build is much easier to get working though).


albanD commented Apr 30, 2025

Oh, sorry @alanhdu, I didn't realize there was a lot of discussion here and just looked at the diff.

Doing things like .item(), wrapping numbers into Tensors, etc. have very subtle implications that need very careful review.
If your goal is to fix these type annotations, I would stay away from these kinds of changes for sure. And if you absolutely need one of them, I would do a single PR with just that change so we can discuss it in detail.
Checking the PR again now!

@albanD albanD left a comment (Collaborator):

The change sounds ok in terms of non-typing behavior now. Thanks for the update!

Given the number of skips, should we expect many users doing type checking of their code to see errors after this?
In particular, a lot of the code where you added ignore actually works today. So it's the typing that is too restrictive, right?

```python
assert_type(BOOL / TENSOR, Any)
assert_type(FLOAT / TENSOR, Any)
assert_type(INT / TENSOR, Any)
assert_type(BOOL // TENSOR, Tensor)
```
A Collaborator commented:

I think the comment above can be removed now that these are fixed!

alanhdu (Contributor Author) commented:

I think the comment still needs to be there, because __rmod__, __rlshift__ and __rrshift__ still resolve to Any (e.g. INT % TENSOR). I haven't figured out exactly why their behavior is different when I try to move the implementations into the if not TYPE_CHECKING block (if I move them, then they resolve to int instead of Tensor for some reason...)


alanhdu commented Apr 30, 2025

Given the number of skips, should we expect many users doing type checking of their code to see errors after this?
In particular, a lot of the code where you added ignore actually works today. So it's the typing that is too restrictive, right?

Yeah, there will probably be some user type errors downstream. Of the # type: ignores I added, I think there are two major categories:

  • Places where there was a pre-existing mypy error that was hidden behind the Any (mostly places where mypy couldn't "see through" the optional check to distinguish Tensor | None from Tensor).
  • Places where the code is type-checked to take a number (e.g. int or float) and the runtime type is a (scalar) Tensor.
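The second category can be sketched with stand-ins (hypothetical names, not torch code): the checker sees a tensor type, but at runtime the value is scalar and can be unwrapped with .item() so both agree.

```python
class FakeScalarTensor:
    """Hypothetical stand-in for a scalar torch.Tensor."""

    def __init__(self, value: float) -> None:
        self._value = value

    def item(self) -> float:
        # Mirrors Tensor.item(): unwrap the scalar to a plain Python number.
        return self._value


# A downstream function annotated to take a plain number.
def scale(factor: float) -> float:
    return factor * 2.0


t = FakeScalarTensor(1.5)
# scale(t) would be flagged by a type checker (FakeScalarTensor is not
# float); unwrapping with .item() satisfies both the checker and the runtime.
assert scale(t.item()) == 3.0
```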


rec commented May 1, 2025

Help for users in the release notes?

I think we should help out your average-practitioner end user who gets new type-checking errors in an existing code base that "already works" by giving them some help in the release notes.

A typical example:

```python
a: Tensor
b: Tensor
c = a // b
```

Before this change, mypy would not know the type of c, but at runtime, c would always in fact be a Tensor.

Consider a new type error in the user's system coming from this one line: somewhere else in the end-user code, type checking finds that some variable v (either c or some variable computed from c) has a type different from the required annotated type at a call site (which includes attribute assignment, etc.).

Some possibilities:

  1. v is a scalar tensor and passed to a function that's expecting a scalar number
  2. v is passed to some function that expects a fairly similar class to Tensor, like np.ndarray
  3. There is some other typing error in their system
  4. All the user typing is right: the code with the error would do something wrong if executed

Case 1 might well work "every time" (as long as you are "sure" that the result is always a scalar tensor), and it's something we do sometimes in the pytorch codebase. Calling Tensor.item() at that point is still probably the right thing to do for end users, because it has the advantage that it will immediately fail if v is not, in fact, a scalar tensor, and then the real type at runtime will be the same as the deduced type at type-checking time.
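The fail-fast property described for case 1 can be sketched with a hypothetical stand-in that mirrors the documented behavior of Tensor.item() (succeed only for a single element, raise otherwise):

```python
class ScalarBox:
    """Hypothetical stand-in mimicking Tensor.item()'s contract."""

    def __init__(self, values) -> None:
        self.values = list(values)

    def item(self) -> float:
        # Like Tensor.item(): only a one-element container has an item().
        if len(self.values) != 1:
            raise RuntimeError("only one-element containers have an item()")
        return float(self.values[0])


assert ScalarBox([4]).item() == 4.0

# A non-scalar value fails immediately instead of silently flowing on.
try:
    ScalarBox([1, 2]).item()
    raised = False
except RuntimeError:
    raised = True
assert raised
```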

Case 2 is a latent trap even if it works. They could use the Python Array API instead, or construct a new instance of the correct class, or their own Protocol type.

Case 3 is a catchall, there might not be much to say.

Case 4 is the very reason we have typing, to find wrong code.


alanhdu commented May 5, 2025

I think we should help out your average-practitioner end user who gets new type-checking errors in an existing code base that "already works" - by giving them some help in the release notes.

I think that's fair. Agreed that some guidance in the release notes might make sense. Is that something I need to do in this PR? I checked the CONTRIBUTING.md, but didn't see instructions on how release notes work.

I agree that case 1 and case 2 are the most likely (since they are places where at runtime things will generally work out). I agree that having explicit casts (e.g. .item() or .numpy()) are useful, although maybe another option is to update the consuming function to use protocols (e.g. SupportsInt or SupportsFloat) where necessary.
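The protocol option mentioned above can be sketched like this (a hypothetical downstream function, not torch code): by accepting typing.SupportsFloat instead of float, the consumer admits anything with a __float__ method, such as a scalar tensor, without a cast at every call site.

```python
from typing import SupportsFloat


def half(x: SupportsFloat) -> float:
    # Accepts any object implementing __float__ (ints, floats, scalar
    # tensors, numpy scalars, ...), then normalizes to a plain float.
    return float(x) / 2.0


assert half(3) == 1.5
assert half(5.0) == 2.5
```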


rec commented May 6, 2025

To be honest, I actually have no idea how the release notes are prepared, but I know they aren't the responsibility of the person making the pull request!

I figured I'd leave notes here in case they were useful to someone.

alanhdu added 3 commits May 6, 2025 11:01
@rec rec added the release notes: torch.func release notes category for torch.vmap or torch.func.* APIs label May 7, 2025

rec commented May 7, 2025

I added a wrong tag, release notes: torch.func to this, because I couldn't find one that said: release notes: typing or : tensor (general) or something...

At least someone will see it!
