gh-96151: Use a private name for passing builtins to dataclass #98143

Merged: 8 commits, Oct 31, 2022

Contributor

@hauntsaninja hauntsaninja commented Oct 10, 2022

There's no indication that BUILTINS is a special name. Other names that are special to dataclass are all prefixed by an underscore.

As mentioned in the issue, we can also avoid this locals dance altogether by using ().__class__.__base__ instead of BUILTINS.object.
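
A minimal sketch of why `().__class__.__base__` can stand in for `object`. The helper `setattr_bypassing_overrides` and the class `ReadOnly` are hypothetical names used only for illustration; they are not part of dataclasses.py.

```python
# An empty tuple's class is tuple, and tuple's only base class is object.
assert ().__class__ is tuple
assert ().__class__.__base__ is object


def setattr_bypassing_overrides(instance, name, value):
    # Same effect as object.__setattr__(instance, name, value), but spelled
    # without the identifier `object`, so a parameter or field that happens
    # to be called `object` (or `BUILTINS`) cannot shadow it.
    ().__class__.__base__.__setattr__(instance, name, value)


class ReadOnly:
    """Hypothetical class that rejects normal attribute assignment."""

    def __setattr__(self, name, value):
        raise AttributeError(f"cannot assign to {name!r}")


ro = ReadOnly()
setattr_bypassing_overrides(ro, "object", "foo")
print(ro.object)  # foo
```

The expression works without any entry in the locals dict, which is what makes it an alternative to passing a helper name at all. The cost is readability.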

Member

@sobolevn sobolevn left a comment

I think such a subtle change needs a test case :)

@ericvsmith ericvsmith self-assigned this Oct 10, 2022
self.assertEqual(c.object, 'foo')
self.assertEqual(c.BUILTINS, 5)
Member

Please create a separate function for this, and don't mix it in with existing tests. Maybe test_field_named_BUILTINS or similar.

Contributor Author

Thanks for the review, made the change :-)

@@ -431,8 +431,8 @@ def _create_fn(name, args, body, *, globals=None, locals=None,
    # worries about external callers.
    if locals is None:
        locals = {}
    if 'BUILTINS' not in locals:
Member

I'd suggest using __BUILTINS__ because dunder names are supposed to be reserved for use by the stdlib.

Member

Before I commit this, I'd like to spend some time researching why this test is even present, instead of just unconditionally assigning to locals. At the very least it could use a comment.

Also, I'm not sure that exposing all of builtins in locals is a good idea, versus just exposing builtins.object.

Contributor Author

@hauntsaninja hauntsaninja Oct 30, 2022

Good catch, looks like the relevant history is:

That is, I think this check was made dead in #9518, but wasn't noticed in that PR

(Also note that the comment at the top of _create_fn is out of date: we do mutate locals, but not via exec)

Contributor Author

@hauntsaninja hauntsaninja Oct 30, 2022

I can do some double checking (locals should all be created within dataclasses.py) and clean that up + change the PR to only pass along object.

(I'll also note that there's still the inscrutable ().__class__.__base__ option on the table, in case we don't want to expose anything at all)

Contributor Author

@hauntsaninja hauntsaninja Oct 31, 2022

Okay, I audited all the call sites, it looks like there is one case where this check is not dead, but it's accidental.

Over here we reuse the same locals dict for two different _create_fn calls:

locals = {'cls': cls,

so the second time round we already have the entry for builtins in the dict. shrug

So my conclusion is:
a) It's safe to remove the check.
b) We should actually go a little further. Since we only need builtins for the frozen init, we should pass that in explicitly when creating __init__ and remove this from _create_fn

I've gone ahead and pushed this change to the PR
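
To illustrate the accidental case described above, here is a toy model of the guarded assignment. `create_fn_with_check` and `shared_locals` are hypothetical stand-ins, not the real `_create_fn` or its callers; the point is only that a dict reused across two calls already holds the key the second time.

```python
import builtins


def create_fn_with_check(locals=None):
    """Toy stand-in for the guard; returns True if the key was already present."""
    if locals is None:
        locals = {}
    already_present = 'BUILTINS' in locals
    if not already_present:
        locals['BUILTINS'] = builtins  # mutates the caller's dict
    return already_present


shared_locals = {'cls': object}  # hypothetical dict reused by one caller
first = create_fn_with_check(shared_locals)
second = create_fn_with_check(shared_locals)
print(first, second)  # False True
```

Since the second call would assign the same value anyway, the guard changes nothing observable, which is why removing it is safe.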

@hauntsaninja
Contributor Author

@JelleZijlstra I made the change, but note that if single underscore names are considered fair game for users, you can elicit all kinds of bad behaviour from dataclass. For example:

from dataclasses import dataclass, field

@dataclass
class X:
    x: int = field(default_factory=lambda: 111)
    _dflt_x: int = field(default_factory=lambda: 222)

X()

I'm happy to open an issue and fix all of these as well, if you think it's worth doing.

@TeamSpen210

If we’re only actually using setattr, and since __init__ is probably a rather hot bit of code, would it be a bit more efficient to bind the __setattr__ method itself in the scope? Then there are fewer lookups for each attribute being set.

@hauntsaninja
Contributor Author

Maybe? I heard a rumour that obj.method(...) is faster these days than m = obj.method; m(...). I'll do some benchmarking and if I get good results putting object or object.__setattr__ into scope, I'll open an issue.
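
One way such a benchmark might be set up with `timeit`. This is only a sketch: `Point`, `via_object_lookup`, `via_bound_name` and `compare` are hypothetical names, and the default-argument binding is a stand-in for putting `object.__setattr__` into the generated function's scope. No result is claimed here; timings vary by interpreter version and machine.

```python
import timeit


class Point:
    """Hypothetical target object with a few attributes to set."""


def via_object_lookup(p):
    # Look up __setattr__ on object for every assignment.
    object.__setattr__(p, 'x', 1)
    object.__setattr__(p, 'y', 2)
    object.__setattr__(p, 'z', 3)


def via_bound_name(p, _setattr=object.__setattr__):
    # object.__setattr__ was resolved once, when the function was defined.
    _setattr(p, 'x', 1)
    _setattr(p, 'y', 2)
    _setattr(p, 'z', 3)


def compare(number=100_000):
    p = Point()
    t_lookup = timeit.timeit(lambda: via_object_lookup(p), number=number)
    t_bound = timeit.timeit(lambda: via_bound_name(p), number=number)
    return t_lookup, t_bound


t_lookup, t_bound = compare()
print(f"object.__setattr__ each time: {t_lookup:.4f}s")
print(f"pre-bound __setattr__:        {t_bound:.4f}s")
```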

@ericvsmith
Member

About single underscores:

> I'm happy to open an issue and fix all of these as well, if you think it's worth doing.

That's worth opening an issue for.

@hauntsaninja
Contributor Author

Thanks, I opened #98886 for the single underscores.

Meanwhile, I also modified this PR to use a slightly different name, since my suggestion in #98886 is to have all these special names prefixed with __dataclass_ (like we already do for __dataclass_self__). I think this PR should be good to go.
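
The effect of the rename can be sketched with a toy `exec`-based function builder. `create_fn`, `frozen_init` and `Box` below are simplified, hypothetical stand-ins rather than the code in dataclasses.py, and `__dataclass_builtins_object__` is used as an illustrative helper name following the `__dataclass_` prefix.

```python
import builtins


def create_fn(name, args, body, *, locals):
    """Toy function builder (not the real dataclasses._create_fn)."""
    args_src = ", ".join(args)
    body_src = "\n".join(f"  {line}" for line in body)
    # The helper names in `locals` become parameters of an outer function,
    # so the generated inner function sees them as closure variables.
    outer_params = ", ".join(locals)
    src = (
        f"def __create_fn__({outer_params}):\n"
        f" def {name}({args_src}):\n"
        f"{body_src}\n"
        f" return {name}"
    )
    namespace = {}
    exec(src, {}, namespace)
    return namespace["__create_fn__"](**locals)


def frozen_init(field, helper_expr, helpers):
    # Generate an __init__ that stores one field via <helper>.__setattr__.
    body = [f"{helper_expr}.__setattr__(self, {field!r}, {field})"]
    return create_fn("__init__", ["self", field], body, locals=helpers)


class Box:
    """Hypothetical plain class standing in for a dataclass instance."""


# Old scheme: the helper is called BUILTINS, and so is the field, so the
# __init__ parameter shadows the helper inside the generated function.
old_init = frozen_init("BUILTINS", "BUILTINS.object", {"BUILTINS": builtins})
try:
    old_init(Box(), 5)
except AttributeError as exc:
    print("old name collides:", exc)

# New scheme: a prefixed helper name that a field is very unlikely to use.
new_init = frozen_init(
    "BUILTINS",
    "__dataclass_builtins_object__",
    {"__dataclass_builtins_object__": object},
)
box = Box()
new_init(box, 5)
print(box.BUILTINS)  # 5
```

In the old scheme the argument `5` is what the name `BUILTINS` refers to inside the generated body, so `BUILTINS.object` fails. With the prefixed helper the field name and the helper no longer compete for the same identifier.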

@ericvsmith ericvsmith added 3.11 only security fixes 3.10 only security fixes labels Oct 31, 2022
@ericvsmith
Member

This all looks great. I'm not sure if this will backport cleanly, but I've added the tags for 3.10 and 3.11 backports. I see this as a bug fix that should be backported, at least to 3.11.

@ericvsmith ericvsmith merged commit 29f98b4 into python:main Oct 31, 2022
@AlexWaygood AlexWaygood removed the 3.11 only security fixes label Oct 31, 2022
@AlexWaygood AlexWaygood added needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes and removed 3.10 only security fixes labels Oct 31, 2022
@miss-islington
Contributor

Thanks @hauntsaninja for the PR, and @ericvsmith for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖

@miss-islington
Contributor

Thanks @hauntsaninja for the PR, and @ericvsmith for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 31, 2022
… This now allows for a field named BUILTIN (pythongh-98143)

(cherry picked from commit 29f98b4)

Co-authored-by: Shantanu <[email protected]>
@bedevere-bot

GH-98899 is a backport of this pull request to the 3.10 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.10 only security fixes label Oct 31, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 31, 2022
… This now allows for a field named BUILTIN (pythongh-98143)

(cherry picked from commit 29f98b4)

Co-authored-by: Shantanu <[email protected]>
@bedevere-bot

GH-98900 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Oct 31, 2022
ericvsmith pushed a commit that referenced this pull request Oct 31, 2022
…. This now allows for a field named BUILTIN (gh-98143) (gh-98899)

gh-96151: Use a private name for passing builtins to dataclass. This now allows for a field named BUILTIN (gh-98143)
(cherry picked from commit 29f98b4)

Co-authored-by: Shantanu <[email protected]>

Co-authored-by: Shantanu <[email protected]>
ericvsmith pushed a commit that referenced this pull request Oct 31, 2022
…. This now allows for a field named BUILTIN (gh-98143) (gh-98900)

gh-96151: Use a private name for passing builtins to dataclass. This now allows for a field named BUILTIN (gh-98143)
(cherry picked from commit 29f98b4)

Co-authored-by: Shantanu <[email protected]>

Co-authored-by: Shantanu <[email protected]>
@hauntsaninja hauntsaninja deleted the gh-96151 branch November 21, 2022 06:59
hauntsaninja added a commit to hauntsaninja/cpython that referenced this pull request Feb 18, 2023
This commit prefixes `__dataclass` to several things in the locals dict:
- Names like _dflt_ (which cause trouble, see first test)
- Names like _type_ (not known to be able to cause trouble)
- _return_type (not known to be able to cause trouble)
- _HAS_DEFAULT_FACTORY (which causes trouble, see second test)

In addition, this removes `MISSING` from the locals dict. As far as I
can tell, this wasn't needed even in the initial implementation of
dataclasses.py (and tests on that version passed with it removed).

This is basically a continuation of python#96151, where fixing this was
welcomed in python#98143 (comment)
hauntsaninja added a commit that referenced this pull request Mar 25, 2023
…mes (#102032)

This commit prefixes `__dataclass` to several things in the locals dict:
- Names like `_dflt_` (which cause trouble, see first test)
- Names like `_type_` (not known to be able to cause trouble)
- `_return_type` (not known to be able to cause trouble)
- `_HAS_DEFAULT_FACTORY` (which causes trouble, see second test)

In addition, this removes `MISSING` from the locals dict. As far as I can tell, this wasn't needed even in the initial implementation of dataclasses.py (and tests on that version passed with it removed). This makes me wary :-)

This is basically a continuation of #96151, where fixing this was welcomed in #98143 (comment)
Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this pull request Mar 27, 2023
…ore names (python#102032)

warsaw pushed a commit to warsaw/cpython that referenced this pull request Apr 11, 2023
…ore names (python#102032)
