gh-129157: Change the location of with statement AST nodes to span the first line #129162

@pablogsal pablogsal commented Jan 21, 2025

| start='async' 'with' '(' a[asdl_withitem_seq*]=','.with_item+ ','? ')' end=':' b=block {
CHECK_VERSION(stmt_ty, 5, "Async with statements are", _PyAST_AsyncWith(a, b, NULL, EXTRA_EXPR(start, end))) }
| start='async' 'with' a[asdl_withitem_seq*]=','.with_item+ end=':' tc=[TYPE_COMMENT] b=block {
CHECK_VERSION(stmt_ty, 5, "Async with statements are", _PyAST_AsyncWith(a, b, NEW_TYPE_COMMENT(p, tc), EXTRA_EXPR(start, end))) }

Member

If I understand correctly, this will create a situation where the source location of a with node does not cover all the source code of the body sub-tree hanging under it. I think in the past we considered this undesirable.

If you think this is ok we can go with it, but I assumed it was not, and my intention was that we add a header child to a with node, so the tree has a with node spanning header + body, with two sub-trees, one for header and one for body. The other compound statements can also have a header + body split.

Member Author

> If I understand correctly, this will create a situation where the source location of a with node does not cover all the source code of the body sub-tree hanging under it. I think in the past we considered this undesirable.
>
> If you think this is ok we can go with it, but I assumed it was not, and my intention was that we add a header child to a with node, so the tree has a with node spanning header + body, with two sub-trees, one for header and one for body. The other compound statements can also have a header + body split.

Ah, that is actually a breaking change for tools :(

Member Author

> If you think this is ok we can go with it, but I assumed it was not, and my intention was that we add a header child to a with node, so the tree has a with node spanning header + body, with two sub-trees, one for header and one for body. The other compound statements can also have a header + body split.

Yeah, I can go with this instead 👍

Member Author

Just to be clear, how different are you picturing this to be from the current version? Currently, for

with A, B, C:
    ...

we have:

Module(
   body=[
      With(
         items=[
            withitem(
               context_expr=Name(id='A', ctx=Load())),
            withitem(
               context_expr=Name(id='B', ctx=Load())),
            withitem(
               context_expr=Name(id='C', ctx=Load()))],
         body=[
            Expr(
               value=Constant(value=Ellipsis))])])

are you picturing some new node wrapping the items?
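
For reference, the locations being discussed can be inspected with the standard `ast` module. This is a minimal sketch and not part of the PR; it assumes an interpreter without this change, where the `With` node spans header plus body. Note that `withitem` nodes carry no position attributes of their own, so the sketch reads the positions from each `context_expr`.

```python
import ast

source = "with A, B, C:\n    ...\n"
with_node = ast.parse(source).body[0]

# Span of the whole statement: (lineno, col_offset, end_lineno, end_col_offset).
with_span = (with_node.lineno, with_node.col_offset,
             with_node.end_lineno, with_node.end_col_offset)
print("With:", with_span)

# withitem has no position attributes; use the context expression instead.
item_spans = [
    (item.context_expr.id,
     item.context_expr.col_offset,
     item.context_expr.end_col_offset)
    for item in with_node.items
]
print("items:", item_spans)
```

The end line of the `With` node is the last line of the body, which is the property the header idea is meant to preserve while still exposing a header-only range.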

Member Author

@pablogsal pablogsal Jan 22, 2025

I have made a version of this just adding a "header" wrapper.

I am thinking that maybe we can generalize this and create a simple "container" node that just wraps and adds location info instead of making it "header". WDYT?

Member

I think we need the location of the whole with statement, up to the :. So it would need a new node wrapping all the items.
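
To make "the whole with statement, up to the `:`" concrete, here is a rough sketch that derives that range from today's AST. `with_header_span` is a hypothetical helper, not something in the PR or in CPython. It assumes ASCII source (AST column offsets are UTF-8 byte offsets) and that no comment containing a `:` sits between the last item and the header's colon.

```python
import ast

def with_header_span(source, node):
    """Return (lineno, col, end_lineno, end_col) from 'with' through ':'."""
    last = node.items[-1]
    tail = last.optional_vars or last.context_expr
    lines = source.splitlines()
    lineno, col = tail.end_lineno, tail.end_col_offset
    # Scan forward from the end of the last item for the header's colon.
    while lineno <= len(lines):
        idx = lines[lineno - 1].find(":", col)
        if idx != -1:
            return (node.lineno, node.col_offset, lineno, idx + 1)
        lineno, col = lineno + 1, 0
    raise ValueError("no ':' found after the last with item")

source = "with A, B, C:\n    ...\n"
node = ast.parse(source).body[0]
print(with_header_span(source, node))  # (1, 0, 1, 13)
```

Having the parser record this range directly would avoid the text scan and its edge cases.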

But let's make sure we have a plan. How do we want to render a traceback for an exception coming out of the __enter__ or __exit__ of one of the context managers? Should we highlight the whole statement and put carets under the expression that creates the specific context manager that caused the error?

CC @ericsnowcurrently

@pablogsal
Member Author

@iritkatriel I have switched to adding an extra optional header type to every statement. The reason is that wrapping is MUCH more verbose and requires changes from everyone handling these nodes in Python as well, which I would prefer to avoid. Additionally, to create a generic one we would need to hack around and do type erasure on the sequences, which I don't like as that reduces safety.

I think this approach is the simplest but is also good enough to be generic.

I have only tackled with here so that we can agree on the approach; we can do more statements in other PRs.

@iritkatriel
Member

> @iritkatriel I have switched to adding an extra optional header type to every statement. The reason is that wrapping is MUCH more verbose and requires changes from everyone handling these nodes in Python as well, which I would prefer to avoid. Additionally, to create a generic one we would need to hack around and do type erasure on the sequences, which I don't like as that reduces safety.
>
> I think this approach is the simplest but is also good enough to be generic.

Yes, I agree. Good idea.

Member

@iritkatriel iritkatriel left a comment

We're getting something, but I think this is not enough yet. Currently we get more precise information than what your change will give us - we know which of the context managers failed:

>>> class C:
...     def __init__(self, arg): self.arg = arg
...     def __enter__(self): 1/self.arg
...     def __exit__(*args): pass
...     
>>> with C(-1), C(0), C(1):
...     1
...     2
...     
Traceback (most recent call last):
  File "<python-input-21>", line 1, in <module>
    with C(-1), C(0), C(1):                                                      # <---
                ~^^^
  File "<python-input-20>", line 3, in __enter__
    def __enter__(self): 1/self.arg
                         ~^~~~~~~~~
ZeroDivisionError: division by zero

Unless we can add carets to show that, this change actually makes things worse in the traceback.
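
The position recorded for the failing frame can also be read programmatically rather than from the rendered carets. A small sketch, assuming Python 3.11 or later for `FrameSummary.colno` and `end_colno`; `with_frame_position` is a hypothetical helper, and the columns it reports depend on the interpreter version, so no output is shown.

```python
import traceback

class C:
    def __init__(self, arg): self.arg = arg
    def __enter__(self): 1 / self.arg
    def __exit__(self, *args): pass

SOURCE = "with C(-1), C(0), C(1):\n    pass\n"

def with_frame_position(source, namespace, filename="<demo>"):
    """Run source and return (lineno, colno, end_colno) of its failing frame."""
    try:
        exec(compile(source, filename, "exec"), namespace)
    except Exception as exc:
        for frame in traceback.extract_tb(exc.__traceback__):
            if frame.filename == filename:
                # colno/end_colno only exist on Python 3.11+.
                return (frame.lineno,
                        getattr(frame, "colno", None),
                        getattr(frame, "end_colno", None))
    return None

position = with_frame_position(SOURCE, {"C": C})
print(position)
if position and position[1] is not None:
    # Slice of the header line that the recorded columns point at.
    print(SOURCE.splitlines()[0][position[1]:position[2]])
```

If the recorded range narrows to a single context manager expression, as in the traceback above, the tooling can tell which `__enter__` failed; a range covering the whole header loses that.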

@@ -1693,4 +1693,4 @@ _PyPegen_concatenate_strings(Parser *p, asdl_expr_seq *strings,

assert(current_pos == n_elements);
return _PyAST_JoinedStr(values, lineno, col_offset, end_lineno, end_col_offset, p->arena);
}
}

Member

revert this.

Member Author

Will do! In any case, don't worry too much for now, as we are just in the exploratory phase. Once we are happy I will clean everything up.

@pablogsal
Member Author

Hummmmm how do you picture the carets then? Can you give me an example of what you have in mind?

I think it would also be helpful if you can draft how you would like to propagate the locations. Currently you have the header in the context manager, but also every one of the individual items, so in theory the compiler is given all it needs, no?

@pablogsal
Member Author

Another problem is that, to add carets, we need to be able to parse the entire context manager together with its body for the ast parse function to be happy. We cannot just pass the header line, because that will fail to parse.

@iritkatriel
Member

> Currently you have the header in the context manager, but also every one of the individual items, so in theory the compiler is given all it needs, no?

True. I guess this isn't the AST's problem then.

@iritkatriel
Member

iritkatriel commented Jan 22, 2025

So we want something like this? (And analogously, for except you highlight the relevant type expression (if there is one), and in a loop you highlight the iterator expression (for an exception in __next__).)

>>> class C:
...     def __init__(self, arg): self.arg = arg
...     def __enter__(self): 1/self.arg
...     def __exit__(*args): pass
...     
>>> with C(-1), C(0), C(1):
...     1
...     2
...     
Traceback (most recent call last):
  File "<python-input-21>", line 1, in <module>
    with C(-1), C(0), C(1):                                                      # <---
    ~~~~~~~~~~~~^^^~~~~~~~
  File "<python-input-20>", line 3, in __enter__
    def __enter__(self): 1/self.arg
                         ~^~~~~~~~~
ZeroDivisionError: division by zero
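
The marker line in that mock-up can be produced from two column ranges, the header and the failing item. A hypothetical sketch of the rendering only (`render_header_carets` is not an existing `traceback` function), with `~` under the header and `^` under the item:

```python
def render_header_carets(header_start, header_end, item_start, item_end):
    """Build a marker line: '~' across the header, '^' under the failing item."""
    if not (header_start <= item_start <= item_end <= header_end):
        raise ValueError("item range must lie inside the header range")
    return (" " * header_start
            + "~" * (item_start - header_start)
            + "^" * (item_end - item_start)
            + "~" * (header_end - item_end))

line = "with C(-1), C(0), C(1):"
item_start = line.index("C(0)")
markers = render_header_carets(0, len(line), item_start, item_start + len("C(0)"))
print(line)
print(markers)
```

This needs both ranges at traceback time, which is why the thread goes on to discuss where the header location should be stored.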

@iritkatriel
Member

iritkatriel commented Jan 23, 2025

> Another problem is that, to add carets, we need to be able to parse the entire context manager together with its body for the ast parse function to be happy. We cannot just pass the header line, because that will fail to parse.

We can add a pass as the body and then it will compile. But maybe we can find a way to do it without having to compile the code again.
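
A quick check of the `pass` trick with the standard `ast` module. This is only a sketch; it assumes the header text ends at the colon, with no trailing comment that would swallow the appended `pass`.

```python
import ast

header = "with C(-1), C(0), C(1):"

# The header on its own is not a complete statement.
try:
    ast.parse(header)
    header_alone_parses = True
except SyntaxError:
    header_alone_parses = False

# Appending a trivial body makes it parse, and the item columns still
# refer to positions in the original header text.
with_node = ast.parse(header + " pass").body[0]
columns = [
    (item.context_expr.col_offset, item.context_expr.end_col_offset)
    for item in with_node.items
]
print(header_alone_parses, columns)
```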

Suppose the code object has locations as they are now (with just the context manager expression for the with statement), but there is also a "compound statement table" which maps instruction ranges to header locations. So all instructions implementing a compound statement header show up there (as one entry for the range of instructions). The space this would take would depend on the number of such blocks in the function (like the exception table grows when you have more try-excepts). Lookup time doesn't matter that much because it's only used for tracebacks.
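
As a purely illustrative model of that idea (no such table exists in CPython; `HeaderEntry` and `lookup_header` are hypothetical names, and the offsets and locations are made up), the table could be a list of instruction ranges with one header location each:

```python
from typing import NamedTuple, Optional, Tuple

Location = Tuple[int, int, int, int]  # (lineno, col, end_lineno, end_col)

class HeaderEntry(NamedTuple):
    start: int           # first instruction offset covered (inclusive)
    end: int             # one past the last instruction offset covered
    location: Location   # source range of the compound statement header

def lookup_header(table, offset) -> Optional[Location]:
    """Return the header location covering `offset`, or None.

    A linear scan is enough here: the lookup would only run while
    rendering a traceback.
    """
    for entry in table:
        if entry.start <= offset < entry.end:
            return entry.location
    return None

# Made-up example: two compound statement headers in one function.
table = [
    HeaderEntry(4, 20, (1, 0, 1, 23)),
    HeaderEntry(36, 48, (5, 4, 5, 18)),
]
print(lookup_header(table, 10))  # (1, 0, 1, 23)
print(lookup_header(table, 24))  # None
```

Like the exception table, this grows with the number of compound statements rather than with the number of instructions.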

@iritkatriel
Member

I think you can revert the changes to the compiler and merge this PR for the AST changes. Then we'll follow up with something in the compiler for the tracebacks in another PR.
