-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
GH-115802: JIT "small" code for Windows #115964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, I realized halfway through the review that this touches upon several large blank spaces in my brain. So this is not a very useful review. I do think that _PyOptimize_Optimize()
belongs in an internal header though.
assert(oparg >= FVC_STR && oparg <= FVC_ASCII); | ||
conv_fn = CONVERSION_FUNCTIONS[oparg]; | ||
conv_fn = _PyEval_ConversionFuncs[oparg]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make inst(CONVERT_VALUE
replicate(4) inst(CONVERT_VALUE
and rely on clang removing the lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to do that where a fifth generic version isn't generated too? Otherwise, it doesn't help, so we might as well stay with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I've a few comments.
As a general comment regarding JIT development:
Don't be afraid to make small changes to bytecodes.c
and other interpreter files to enable the JIT to work more smoothly.
PyAPI_DATA(const binaryfunc) _PyEval_BinaryOps[]; | ||
PyAPI_DATA(const conversion_func) _PyEval_ConversionFuncs[]; | ||
|
||
PyAPI_FUNC(int) _PyEval_CheckExceptStarTypeValid(PyThreadState *tstate, PyObject* right); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than exposing all these symbols, could we put the function pointers in a struct, and pass that to the JIT as an argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue isn't finding symbols when jitting the code (jit.c
is linked into the main executable and can find everything just fine).
The purpose of adding PyAPI_FUNC(...)
and PyAPI_DATA(...)
to the declarations is so they are declared with __declspec(dllimport)
when compiling the templates. This makes Clang emit indirect memory accesses on Windows (similar to position-independent code on other platforms).
|
For 32-bit Windows, the small code model works fine out-of-the-box.
For 64-bit Windows, we don't have position-independent code with GOT relocations like Linux and macOS. Instead, we compile the stencils like a
Py_BUILD_CORE_MODULE
binary extension module, which creates a level of indirection similar to a GOT (__impl_Py_XXX
holds the address ofPy_XXX
). We process this just like the GOT on other platforms, and everything just works.This does require adding
PyAPI_FUNC
andPyAPI_DATA
to some symbols in internal headers to get the correct visibility, which is why this PR touches so many files.Looks like this makes the Windows JIT ~3-4% faster.