gh-91719: Reload opcode on unknown error so that C can optimize the dispatching in ceval.c #94364

Merged
merged 9 commits into from
Jun 30, 2022

Conversation

neonene
Contributor

@neonene neonene commented Jun 28, 2022

This patch helps MSVC optimize the switch code in _PyEval_EvalFrameDefault() by ensuring that, on non-debug builds, only the dispatcher reads the opcode variable.

Currently, each case reloads the opcode from elsewhere when it needs it; the only exception is the unknown-opcode case.
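
For readers unfamiliar with the pattern, here is a minimal sketch of the idea on a toy stack machine. It is not the code from this PR or from ceval.c: the opcodes, `instr_t`, `run()`, and `bad_opcode` are all hypothetical names made up for illustration. The point is only that the unknown-opcode case reloads the opcode from the instruction stream, so the value fetched by the dispatcher is never read after the `switch`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy stack machine; these are not CPython's. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2 };

#define STACK_SIZE 64

/* Each instruction is an (opcode, oparg) pair of bytes. */
typedef struct {
    uint8_t opcode;
    uint8_t oparg;
} instr_t;

/* Runs `code` until OP_HALT and returns the top of the stack (0 if empty).
   Returns -1 on an unknown opcode, storing it in *bad_opcode.
   Returns -2 on stack overflow or underflow. */
int run(const instr_t *code, int *bad_opcode)
{
    assert(code != NULL && bad_opcode != NULL);

    int stack[STACK_SIZE];
    int sp = 0;
    const instr_t *next_instr = code;

    for (;;) {
        /* Dispatcher: the only place that reads the fetched opcode. */
        int opcode = next_instr->opcode;
        int oparg = next_instr->oparg;
        next_instr++;

        switch (opcode) {
        case OP_PUSH:
            if (sp >= STACK_SIZE) return -2;
            stack[sp++] = oparg;
            break;
        case OP_ADD:
            if (sp < 2) return -2;
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default:
            /* Reload the opcode from the instruction stream instead of
               reusing the value fetched above, so that value does not
               have to stay live past the switch. */
            opcode = next_instr[-1].opcode;
            *bad_opcode = opcode;
            return -1;
        }
    }
}
```

Whether a given compiler then avoids keeping `opcode` on the stack depends on the compiler and build options; the sketch only shows the source-level shape of the change.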

faster-cpython/ideas#422

@neonene
Contributor Author

neonene commented Jun 29, 2022

My reports were based on fda4b2f (2022-06-22). Since then, the 3.12 tip has removed one stack access in PGO and Release (Ob3 only) builds. With this patch, another access disappears.

3.11 seems a bit different on Ob3. I'm re-investigating 3.11 PGO now.

@gvanrossum
Member

Okay, I'll wait before approving and merging.

@gvanrossum
Member

PS. You need a news blurb.

@gvanrossum
Member

We're all good except the news blurb. If you can't handle the tooling for that (just click on "Details" for the failing test) let me know and I'll make something up. Then we will land. This is now a 3.11 release blocker.

@neonene
Contributor Author

neonene commented Jun 29, 2022

3.12a0+ (2022-6-27 edb10ca) PGO

> pyperf compare_to original patched -G --min-speed=2

Slower (6):
- richards: 79.8 ms +- 1.3 ms -> 85.5 ms +- 1.5 ms: 1.07x slower
- sympy_sum: 305 ms +- 5 ms -> 322 ms +- 28 ms: 1.05x slower
- hexiom: 10.1 ms +- 0.1 ms -> 10.5 ms +- 0.1 ms: 1.04x slower
- unpickle_list: 7.04 us +- 0.09 us -> 7.27 us +- 0.09 us: 1.03x slower
- pyflate: 727 ms +- 21 ms -> 749 ms +- 12 ms: 1.03x slower
- scimark_sor: 191 ms +- 3 ms -> 196 ms +- 4 ms: 1.03x slower

Faster (21):
- sqlalchemy_imperative: 56.2 ms +- 8.3 ms -> 44.7 ms +- 1.0 ms: 1.26x faster
- tornado_http: 484 ms +- 20 ms -> 392 ms +- 4 ms: 1.23x faster
- dulwich_log: 235 ms +- 32 ms -> 193 ms +- 6 ms: 1.22x faster
- logging_silent: 168 ns +- 7 ns -> 143 ns +- 3 ns: 1.17x faster
- unpack_sequence: 77.9 ns +- 1.3 ns -> 70.0 ns +- 0.5 ns: 1.11x faster
- logging_format: 26.0 us +- 1.7 us -> 23.7 us +- 0.5 us: 1.10x faster
- xml_etree_iterparse: 190 ms +- 14 ms -> 177 ms +- 3 ms: 1.08x faster
- nbody: 168 ms +- 3 ms -> 157 ms +- 6 ms: 1.07x faster
- logging_simple: 22.9 us +- 2.0 us -> 21.5 us +- 0.4 us: 1.06x faster
- spectral_norm: 164 ms +- 3 ms -> 154 ms +- 4 ms: 1.06x faster
- sqlalchemy_declarative: 273 ms +- 25 ms -> 261 ms +- 7 ms: 1.05x faster
- xml_etree_parse: 253 ms +- 15 ms -> 242 ms +- 2 ms: 1.05x faster
- scimark_monte_carlo: 103 ms +- 3 ms -> 98.6 ms +- 1.6 ms: 1.05x faster
- unpickle_pure_python: 362 us +- 24 us -> 346 us +- 4 us: 1.04x faster
- scimark_fft: 473 ms +- 8 ms -> 453 ms +- 11 ms: 1.04x faster
- regex_v8: 30.2 ms +- 1.1 ms -> 29.3 ms +- 0.3 ms: 1.03x faster
- mako: 19.3 ms +- 0.8 ms -> 18.7 ms +- 0.3 ms: 1.03x faster
- crypto_pyaes: 127 ms +- 7 ms -> 124 ms +- 2 ms: 1.03x faster
- float: 139 ms +- 2 ms -> 136 ms +- 1 ms: 1.03x faster
- xml_etree_process: 102 ms +- 4 ms -> 99.6 ms +- 1.4 ms: 1.02x faster
- chaos: 134 ms +- 2 ms -> 131 ms +- 1 ms: 1.02x faster

Benchmark hidden because not significant (32): 2to3, chameleon, deltablue, django_template, fannkuch, go, html5lib, json_dumps, json_loads, meteor_contest, nqueens, pathlib, pickle, pickle_dict, pickle_list, pickle_pure_python, pidigits, python_startup, python_startup_no_site, raytrace, regex_compile, regex_dna, regex_effbot, scimark_lu, scimark_sparse_mat_mult, sqlite_synth, sympy_expand, sympy_integrate, sympy_str, telco, unpickle, xml_etree_generate

Geometric mean: 1.02x faster

@neonene
Contributor Author

neonene commented Jun 30, 2022

3.11b3+ (2022-6-29 a548a45) PGO

> pyperf compare_to original patched -G --min-speed=2

Slower (3):
- logging_silent: 162 ns +- 3 ns -> 192 ns +- 3 ns: 1.19x slower
- python_startup: 17.0 ms +- 0.3 ms -> 17.7 ms +- 2.0 ms: 1.04x slower
- logging_simple: 21.9 us +- 0.7 us -> 22.6 us +- 0.9 us: 1.03x slower

Faster (31):
- tornado_http: 492 ms +- 89 ms -> 405 ms +- 15 ms: 1.22x faster
- spectral_norm: 179 ms +- 3 ms -> 162 ms +- 4 ms: 1.10x faster
- unpickle_pure_python: 413 us +- 13 us -> 375 us +- 3 us: 1.10x faster
- nbody: 170 ms +- 1 ms -> 157 ms +- 7 ms: 1.08x faster
- scimark_fft: 508 ms +- 8 ms -> 475 ms +- 12 ms: 1.07x faster
- unpack_sequence: 81.9 ns +- 1.8 ns -> 77.0 ns +- 1.5 ns: 1.06x faster
- pickle_pure_python: 581 us +- 9 us -> 555 us +- 9 us: 1.05x faster
- scimark_sparse_mat_mult: 6.50 ms +- 0.12 ms -> 6.22 ms +- 0.19 ms: 1.05x faster
- float: 145 ms +- 2 ms -> 139 ms +- 3 ms: 1.04x faster
- 2to3: 542 ms +- 27 ms -> 520 ms +- 17 ms: 1.04x faster
- mako: 19.3 ms +- 0.2 ms -> 18.5 ms +- 0.1 ms: 1.04x faster
- scimark_sor: 204 ms +- 2 ms -> 196 ms +- 3 ms: 1.04x faster
- sqlite_synth: 5.40 us +- 0.04 us -> 5.19 us +- 0.03 us: 1.04x faster
- unpickle: 23.0 us +- 0.5 us -> 22.1 us +- 0.4 us: 1.04x faster
- richards: 89.4 ms +- 1.1 ms -> 86.3 ms +- 0.7 ms: 1.04x faster
- pathlib: 186 ms +- 8 ms -> 180 ms +- 5 ms: 1.04x faster
- scimark_monte_carlo: 107 ms +- 1 ms -> 103 ms +- 1 ms: 1.04x faster
- telco: 11.7 ms +- 0.3 ms -> 11.3 ms +- 0.2 ms: 1.03x faster
- chaos: 137 ms +- 2 ms -> 133 ms +- 1 ms: 1.03x faster
- regex_compile: 236 ms +- 2 ms -> 229 ms +- 3 ms: 1.03x faster
- nqueens: 162 ms +- 2 ms -> 157 ms +- 3 ms: 1.03x faster
- xml_etree_generate: 146 ms +- 4 ms -> 142 ms +- 3 ms: 1.03x faster
- raytrace: 571 ms +- 7 ms -> 554 ms +- 9 ms: 1.03x faster
- pyflate: 775 ms +- 8 ms -> 754 ms +- 6 ms: 1.03x faster
- scimark_lu: 158 ms +- 2 ms -> 154 ms +- 2 ms: 1.03x faster
- crypto_pyaes: 126 ms +- 4 ms -> 123 ms +- 3 ms: 1.02x faster
- hexiom: 10.9 ms +- 0.2 ms -> 10.7 ms +- 0.1 ms: 1.02x faster
- deltablue: 7.07 ms +- 0.10 ms -> 6.92 ms +- 0.09 ms: 1.02x faster
- sympy_str: 545 ms +- 19 ms -> 534 ms +- 10 ms: 1.02x faster
- django_template: 85.8 ms +- 1.3 ms -> 84.1 ms +- 1.6 ms: 1.02x faster
- html5lib: 107 ms +- 3 ms -> 105 ms +- 2 ms: 1.02x faster

Benchmark hidden because not significant (25): chameleon, dulwich_log, fannkuch, go, json_dumps, json_loads, logging_format, meteor_contest, pickle, pickle_dict, pickle_list, pidigits, python_startup_no_site, regex_dna, regex_effbot, regex_v8, sqlalchemy_declarative, sqlalchemy_imperative, sympy_expand, sympy_integrate, sympy_sum, unpickle_list, xml_etree_parse, xml_etree_iterparse, xml_etree_process

Geometric mean: 1.02x faster

@neonene neonene changed the title from "gh-91719: Make MSVC generate faster switch code which avoids extra stack access" to "gh-91719: Reload opcode on unknown error so that C can optimize the dispatching in ceval.c" Jun 30, 2022
Member

@gvanrossum gvanrossum left a comment

Let's go with this.

@gvanrossum gvanrossum merged commit ea39b77 into python:main Jun 30, 2022
@gvanrossum gvanrossum added the needs backport to 3.11 only security fixes label Jun 30, 2022
@miss-islington
Contributor

Thanks @neonene for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 30, 2022
… the dispatching in ceval.c (pythonGH-94364)

(cherry picked from commit ea39b77)

Co-authored-by: neonene <[email protected]>
@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Jun 30, 2022
@bedevere-bot

GH-94453 is a backport of this pull request to the 3.11 branch.

@neonene
Contributor Author

neonene commented Jun 30, 2022

Thanks for reviewing and merging.

@neonene neonene deleted the evalswitch branch June 30, 2022 16:02
gvanrossum pushed a commit that referenced this pull request Jun 30, 2022
…ispatching in ceval.c (GH-94364) (#94453)

(cherry picked from commit ea39b77)

Co-authored-by: neonene <[email protected]>
gvanrossum pushed a commit to gvanrossum/cpython that referenced this pull request Jun 30, 2022