FIX Fix free-threaded failure because dictionary changed size during iteration #32264
# Copy is needed with free-threaded context to avoid
# RuntimeError: dictionary changed size during iteration.
# copy.deepcopy applied on an instance of base_class adds
# __slotnames__ attribute to base_class.
base_class_items = vars(base_class).copy().items()
for attr, value in base_class_items:
Where is this copy coming from? Where do we have the slot names (other than tags)?
Also, I think this only lowers the probability of encountering the issue, since you can still hit it in the middle of `vars(base_class).copy()`.
I'd like to understand where the issue actually comes from. `__slotnames__` seems to be just there:
>>> class Test:
... pass
...
>>> a = Test()
>>> a.b = 10
>>> import copy
>>> dir(copy.deepcopy(a))
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__firstlineno__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slotnames__', '__static_attributes__', '__str__', '__subclasshook__', '__weakref__', 'b']
>>> dir(a)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__firstlineno__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slotnames__', '__static_attributes__', '__str__', '__subclasshook__', '__weakref__', 'b']
and not necessarily added?
To try to make my original statement more precise: `__slotnames__` is not present after the class definition but is added when you do a `copy` (or `deepcopy`).
import copy
class A: pass
print(f'after class definition: {"__slotnames__" in vars(A)=}')
copy.copy(A())
print(f'after copy: {"__slotnames__" in vars(A)=}')
Output:
after class definition: "__slotnames__" in vars(A)=False
after copy: "__slotnames__" in vars(A)=True
Here is the stack trace that shows where the addition of the `__slotnames__` attribute (through `deepcopy`) comes from:
-> self._bootstrap_inner()
/home/lesteve/micromamba/envs/py314t/lib/python3.14t/threading.py(1081)_bootstrap_inner()
-> self._context.run(self.run)
/home/lesteve/micromamba/envs/py314t/lib/python3.14t/threading.py(1023)run()
-> self._target(*self._args, **self._kwargs)
/home/lesteve/micromamba/envs/py314t/lib/python3.14t/site-packages/pytest_run_parallel/plugin.py(60)closure()
-> fn(*args, **kwargs)
/home/lesteve/micromamba/envs/py314t/lib/python3.14t/contextlib.py(85)inner()
-> return func(*args, **kwds)
/home/lesteve/dev/scikit-learn/sklearn/tests/test_pipeline.py(2234)test_metadata_routing_for_pipeline()
-> est = set_request(est, "fit", sample_weight=True, prop=True)
/home/lesteve/dev/scikit-learn/sklearn/tests/test_pipeline.py(2225)set_request()
-> getattr(est, f"set_{method}_request")(**kwarg)
/home/lesteve/dev/scikit-learn/sklearn/utils/_metadata_requests.py(1352)func()
-> requests = _instance._get_metadata_request()
/home/lesteve/dev/scikit-learn/sklearn/utils/_metadata_requests.py(1530)_get_metadata_request()
-> requests = get_routing_for_object(self._metadata_request)
/home/lesteve/dev/scikit-learn/sklearn/utils/_metadata_requests.py(1218)get_routing_for_object()
-> return deepcopy(obj)
/home/lesteve/micromamba/envs/py314t/lib/python3.14t/copy.py(146)deepcopy()
-> rv = reductor(4)
> /home/lesteve/micromamba/envs/py314t/lib/python3.14t/copyreg.py(160)_slotnames()
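The last frame of the trace is `copyreg._slotnames`, which caches its result on the class. A minimal sketch of that side effect, assuming CPython's private `copyreg._slotnames` helper behaves as in the trace above (the class `Plain` and the function name are hypothetical and only for illustration):

```python
import copyreg


def slotnames_cached_after_call():
    """Show that copyreg._slotnames stores its result on the class itself."""

    class Plain:  # hypothetical class without __slots__
        pass

    before = "__slotnames__" in vars(Plain)
    # Private CPython helper reached by copy/deepcopy through __reduce_ex__.
    names = copyreg._slotnames(Plain)
    after = "__slotnames__" in vars(Plain)
    return before, names, after


print(slotnames_cached_after_call())
```

The class dictionary therefore grows by one key the first time an instance is copied, which is the mutation the loop over `vars(base_class).items()` can observe.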
Fix #32087
This was seen in #32087 in different test functions but the problematic code is the same in `sklearn/utils/_metadata_requests.py`.
I can reproduce locally with the following (it fails ~5-10 times out of 20 on my machine on `main`):
Failure details
Here is my current understanding of the problem:
- [...] the `__slotnames__` attribute has been added
- the `__slotnames__` attribute is added to a class when `copy.deepcopy` is called on an instance of this class, and the metadata routing code is using `copy.deepcopy`.
- [...] `__metadata_request__`. We don't care whether `__slotnames__` has been added or not.
If we want to avoid doing a copy, I guess another option would be to use a lock to ensure that the `for` loop and the `copy.deepcopy` cannot happen at the same time, but it seems a bit more complicated than making the copy.
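For comparison, here is a rough sketch of what the lock-based alternative could look like. It is an assumption about the shape of such a fix, not code from this PR: `_class_dict_lock`, `scan_with_lock`, `deepcopy_with_lock`, `run_concurrently` and `Estimator` are all hypothetical names.

```python
import copy
import threading

# Hypothetical module-level lock shared by the loop and the deepcopy.
_class_dict_lock = threading.Lock()


class Estimator:  # hypothetical stand-in for an estimator class
    __metadata_request__fit = {"sample_weight": True}


def scan_with_lock(cls):
    """Iterate over the class namespace while holding the lock."""
    with _class_dict_lock:
        return [attr for attr in vars(cls) if "__metadata_request__" in attr]


def deepcopy_with_lock(obj):
    """Deep-copy under the same lock so it cannot overlap with the scan."""
    with _class_dict_lock:
        return copy.deepcopy(obj)


def run_concurrently(n_threads=8):
    """Run both operations from several threads and collect any errors."""
    results, errors = [], []

    def worker():
        try:
            deepcopy_with_lock(Estimator())
            results.append(scan_with_lock(Estimator))
        except RuntimeError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, errors
```

Every code path that copies an instance or walks the class namespace would have to take the same lock, which is why this reads as more invasive than the one-line snapshot.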