Break Artist._remove_method reference cycle #28861
Thank you for opening your first PR into Matplotlib!
If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks.
You can also join us on gitter for real-time discussion.
For details on testing, writing docs, and our review process, please see the developer guide
We strive to be a welcoming and open project. Please follow our Code of Conduct.
Thank you for this. The number of loops we have is a known problem that we are always open to improving! The reason for this bit of code is exactly this (to break the cycle between the Artist and its parents). I think we are doing it with the swap like this to reduce the amount of time we are in an invalid state (we say that we are not thread safe, but that does not mean we should intentionally drive to invalid states). The invalid state I'm worried about here is the case where there are artists in the children that have their parents cleared, which will cause draws to fail. I also have a concern that there may be other references to the list object floating around. I would propose a slight variation on this change where we keep the swap, but add |
Should we also set |
Yes please. |
Looks like the new test is failing on python v3.13. Relevant logs:
I wonder if the GC is running at the same time as |
I think you have found a CPython bug as you should not be able to segfault Python from Python. |
and it looks like the scatter got dropped from the test at some point. |
I still see |
I'll work on a minimal repro to submit to CPython then! What should we do with this test and this pull request while that's happening? |
I decided to skip the test on non-final versions of 3.13.0 because python/cpython#124538 was marked as a release blocker. So, we should be good to re-enable this test on 3.13.0-final and later. |
Co-authored-by: Elliott Sales de Andrade <[email protected]>
There are two failures on the most recent test run:
|
What are we waiting for to merge this? Is there anything I need to do? The only failing test (appVeyor) seems unrelated to my changes and I can see it failing on other PRs. Has it been fixed on main yet? Should I pull latest main again? |
Co-authored-by: Elliott Sales de Andrade <[email protected]>
PR summary
I think I've found a reference cycle that can consume quite a bit of memory. The size of the reachable bytes from the reference cycle seems to grow linearly with the number of points I've plotted, and Figure.clear() does not remove them. gc.collect() would remove this memory, but in performance-critical applications, waiting for the GC is no fun.
The reference cycle looks like this:
(credit to refcycle for helping me find it)
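As a rough illustration of the kind of cycle described here, the sketch below uses two hypothetical stand-in classes (Parent and Child, not matplotlib's Axes and Artist). It assumes the cycle runs through a bound list.remove stored on the child, which is an assumption about the shape of the cycle rather than a copy of matplotlib's code. With the cyclic collector switched off, replacing the list leaves the child alive, and only gc.collect() frees it.

```python
import gc
import weakref


class Child:
    """Hypothetical stand-in for an Artist (not matplotlib code)."""

    def __init__(self):
        self._remove_method = None


class Parent:
    """Hypothetical stand-in for an Axes (not matplotlib code)."""

    def __init__(self):
        self._children = []

    def add(self, child):
        self._children.append(child)
        # The bound method references the list, and the list references the child.
        child._remove_method = self._children.remove

    def clear(self):
        # Swapping in a new list leaves the old list and its children in a cycle.
        self._children = []


gc.disable()  # keep the automatic collector from interfering with the demo
try:
    parent = Parent()
    child = Child()
    parent.add(child)
    probe = weakref.ref(child)  # lets us observe whether the child was freed
    del child

    parent.clear()
    alive_after_clear = probe() is not None  # the cycle keeps the child alive

    gc.collect()
    alive_after_collect = probe() is not None  # the cycle collector frees it
finally:
    gc.enable()

print(alive_after_clear, alive_after_collect)
```

The weak reference is only there to observe the child's lifetime; it plays no part in the cycle itself.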
This PR breaks the cycle by emptying out the Axes._children list instead of replacing it with an empty list. This removes the reference from the list to the Artist, allowing the reference count to reach 0 on the Artist.
I've found this method to be an effective workaround (without modifying matplotlib)
But I would like to contribute this memory usage improvement back into matplotlib, though I'm not sure of the best way. This PR is the smallest and simplest change, but I'm not certain if it covers enough similar reference cycles or just the one that showed up in my profiling. I also don't know if this is a clean way to solve the problem because I'm not very familiar with matplotlib internals. I considered making Artist._remove_method a weakref, but that seems like a larger change that I'm not ready for (yet?). How do you think we should approach this reference cycle?
Note that this PR branches from a47e26b (not latest main) because of the issues I encountered with the next commit (see #28866 and 597554d#commitcomment-147024800).
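The two approaches weighed in this summary, emptying the list in place and holding only a weak reference, can be compared with a minimal sketch. Owner and Item are hypothetical stand-ins, and add_strong, add_weak, clear_in_place and clear_by_swap are names invented for this illustration; none of them are matplotlib APIs. Because a plain list cannot be weakly referenced, the weak variant here points at the owner instead, which is one possible design and not necessarily what a real change would do.

```python
import gc
import weakref


class Item:
    """Hypothetical stand-in for an Artist (not matplotlib code)."""

    def __init__(self):
        self._remove_method = None
        self._owner_ref = None


class Owner:
    """Hypothetical stand-in for an Axes (not matplotlib code)."""

    def __init__(self):
        self._children = []

    def add_strong(self, item):
        self._children.append(item)
        item._remove_method = self._children.remove  # strong reference to the list

    def add_weak(self, item):
        self._children.append(item)
        # Lists cannot be weakly referenced, so point at the owner instead.
        item._owner_ref = weakref.ref(self)

    def clear_in_place(self):
        self._children.clear()  # the list drops its references to the items

    def clear_by_swap(self):
        self._children = []  # the old list keeps its references to the items


def freed_without_gc(add_name, clear_name):
    """Return True if the item is freed by reference counting alone."""
    owner = Owner()
    item = Item()
    getattr(owner, add_name)(item)
    probe = weakref.ref(item)
    del item
    getattr(owner, clear_name)()
    return probe() is None


gc.disable()  # rely on reference counting only
try:
    emptied = freed_without_gc("add_strong", "clear_in_place")
    weak_swapped = freed_without_gc("add_weak", "clear_by_swap")
    strong_swapped = freed_without_gc("add_strong", "clear_by_swap")
finally:
    gc.enable()
    gc.collect()  # clean up the cycle left behind by the last case

print(emptied, weak_swapped, strong_swapped)
```

In this sketch both alternatives let reference counting reclaim the item immediately, while the strong reference combined with a swap leaves it waiting for the cyclic collector. Emptying in place does mutate a list that other code may still hold, which is the concern raised in the review comments.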
Here is the benchmark script that I've been using to evaluate the effectiveness of these changes:
Running the benchmark before this change prints (showing that the workaround breaks the cycle)
And after (showing that the PR behaves similarly to the workaround)
PR checklist