gh-97912: Avoid quadratic behavior when adding LOAD_FAST_CHECK #97952
A benchmark:

    from time import perf_counter
    from pathlib import Path
    import sympy.integrals.rubi.rules.sine as sine

    text = Path(sine.__file__).read_text("utf-8")
    t0 = perf_counter()
    for _ in range(10):
        compile(text, "sine", "exec")
    t1 = perf_counter()
    print((t1 - t0) / 10)

On my machine, this goes from about 0.96 seconds before this PR to 0.32 seconds after, both on Windows without PGO. Benchmarking 3.11 (no PGO) in the same way, I also get roughly 0.32 seconds.
The sympy file in the benchmark can be downloaded from

I can confirm the benchmark goes much faster compared to Main: 1.9035797520901543

Benchmark at 63 locals looks like this: Main: 0.00048459595802705736

Benchmark at 64 locals looks like this: Main: 0.0004902609169948846

The patch is still faster in those cases.
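The scripts behind the 63-local and 64-local measurements are not shown in this thread. A minimal sketch of how such a measurement could be set up follows, assuming a generated function that assigns and then reads a chosen number of locals. The names `make_source` and `time_compile` are hypothetical helpers, not part of the PR.

```python
from time import perf_counter


def make_source(nlocals: int) -> str:
    """Build the source of one function that assigns, then reads, `nlocals` locals."""
    lines = ["def f():"]
    lines += [f"    x{i} = {i}" for i in range(nlocals)]
    # A flat list keeps the expression shallow even for many locals.
    lines.append("    return [" + ", ".join(f"x{i}" for i in range(nlocals)) + "]")
    return "\n".join(lines) + "\n"


def time_compile(nlocals: int, repeats: int = 50) -> float:
    """Return the mean wall-clock time, in seconds, of one compile() call."""
    text = make_source(nlocals)
    t0 = perf_counter()
    for _ in range(repeats):
        compile(text, f"locals_{nlocals}", "exec")
    t1 = perf_counter()
    return (t1 - t0) / repeats


if __name__ == "__main__":
    # Sizes on both sides of 64, plus one much larger case.
    for n in (63, 64, 65, 1000):
        print(n, time_compile(n))
```

Timings from a sketch like this depend on the interpreter build and the machine, so they are only comparable between two builds run on the same setup.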
I could measure no compilation performance difference from adding
Before this PR:
Intermediate: without fast_scan_many_locals
After this PR, including fast_scan_many_locals: the same counts as before this PR. Since the
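The counting script used for the comparison above is not included in the thread. One way to obtain such counts, assuming they refer to how often each instruction appears in the compiled bytecode, is sketched below with the standard `dis` module. `count_opnames` and the sample `SOURCE` are hypothetical and only for illustration.

```python
import dis
from collections import Counter
from types import CodeType


def count_opnames(code: CodeType) -> Counter:
    """Count instruction names in `code` and in all nested code objects."""
    counts = Counter(instr.opname for instr in dis.get_instructions(code))
    for const in code.co_consts:
        if isinstance(const, CodeType):
            counts += count_opnames(const)
    return counts


# Sample input (an assumption): `value` may be unbound when `flag` is false.
SOURCE = '''
def maybe_unbound(flag):
    if flag:
        value = 1
    return value
'''

counts = count_opnames(compile(SOURCE, "example", "exec"))
# Print every LOAD_FAST variant the running interpreter emitted.
for name, n in sorted(counts.items()):
    if name.startswith("LOAD_FAST"):
        print(name, n)
```

The set of instruction names differs between CPython versions, so the printed names and counts should only be compared across builds of the same version. On interpreters that predate `LOAD_FAST_CHECK`, its count is simply zero.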
🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit b07183c 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.
@iritkatriel or @markshannon, are there any objections to adding
No objections. I wondered if we can have this called from optimize_cfg (so that it can be accessed from unit tests). It would probably just require passing in the nparams and nlocals rather than the compiler (c). We can make that change later though. |