py/nlr: Factor out common NLR code to generic functions. #3492
Each NLR implementation (Thumb, x86, x64, xtensa, setjmp) duplicates a lot of the NLR code, specifically the code that pushes and pops the NLR pointer to maintain the linked list of NLR buffers. This patch factors all of that code out of the specific implementations into generic functions in nlr.c, eliminating the duplication.

The factoring also allows the machine-specific NLR code to be pure assembler, so nlrthumb.c can use the naked function attribute correctly (naked functions may contain only basic inline assembler).

A small overhead is introduced (typically 1 machine instruction) because the generic nlr_jump() must now call nlr_jump_tail() rather than the two being one combined function.
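To make the linked-list bookkeeping described above concrete, here is a minimal sketch of what the generic, machine-independent part could look like. It is not the code from this patch: every `sketch_` name is hypothetical, the buffer holds only the fields the generic code touches, and the list head is a plain global here. The register save/restore that the machine-specific assembler would perform is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for nlr_buf_t: only the fields the generic code needs.
typedef struct sketch_nlr_buf {
    struct sketch_nlr_buf *prev; // next-outer buffer in the linked list
    void *ret_val;               // value delivered by a jump
    // machine-specific saved registers would follow here
} sketch_nlr_buf_t;

// Head of the linked list of active buffers (a single global in this sketch).
static sketch_nlr_buf_t *sketch_nlr_top = NULL;

// Generic tail of a push: link the buffer in as the new innermost handler.
// Machine-specific code would save registers first and then call this.
static unsigned int sketch_nlr_push_tail(sketch_nlr_buf_t *nlr) {
    nlr->prev = sketch_nlr_top;
    sketch_nlr_top = nlr;
    return 0; // in this sketch, 0 marks the normal (non-jump) return path
}

// Generic pop: unlink the innermost buffer once its protected region ends.
static void sketch_nlr_pop(void) {
    assert(sketch_nlr_top != NULL);
    sketch_nlr_top = sketch_nlr_top->prev;
}

// Generic head of a jump: store the value, unlink the innermost buffer and
// return it. A machine-specific jump tail would then restore registers from it.
static sketch_nlr_buf_t *sketch_nlr_jump_prepare(void *val) {
    sketch_nlr_buf_t *top = sketch_nlr_top;
    assert(top != NULL);
    top->ret_val = val;
    sketch_nlr_top = top->prev;
    return top;
}
```

With this split, each port only has to supply the register save and restore; the list handling exists once. The cost is the extra call from the generic jump into the machine-specific tail, which is the small overhead the description mentions.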
unsigned int nlr_push_tail(nlr_buf_t *nlr) asm("nlr_push_tail");
#else
__attribute__((used)) unsigned int nlr_push_tail(nlr_buf_t *nlr);
#endif
@stinos I might need your help here if you don't mind: what exactly was the original intention of this code (copied from the old nlrx64.c), and what is the right thing to do now with this refactoring?
See the comments on #2923. I verified it and it has to stay, otherwise the 32-bit mingw builds still have an undefined reference.
Ok, thanks for verifying. The issue was that this code broke AppVeyor, but now I see the error: this "asm" declaration is only needed for x86, not x86-64. I updated the PR to reflect this and it should now be the same as before the refactor.
Ah sorry, I didn't notice that; I only tried x86 and x86-64 builds with mingw. Anyway, this PR looks ok and I'll look into the differences for cygwin/mingw/... later on.
> I'll look into the differences for cygwin/mingw/... later on.
Thanks. There's no hurry for this.
void *regs[12];
#else
void *regs[8];
#endif
@stinos furthermore: for __CYGWIN__ why is it 12 regs and not 10 (an extra 2 for rsi and rdi compared with non-unix)? And why isn't this check defined(_WIN32) || defined(__CYGWIN__) to match the definition of the NLR_OS_WINDOWS macro?
Not sure. This was introduced years ago by you and it didn't change since. However, I think this might actually have been an oversight in #1632 and the follow-up bf1570c: there the _WIN32 got introduced but nlr.h has not been changed. And this seems very wrong (not tested, but in nlrx64 I see more movq calls than there is space in nlr->regs, right?). So that might even be the reason why 64-bit msys mingw builds segfault (assuming they still do). Not sure what the best way to solve this would be now: I don't have time to test all of this right now, though I do next week. So either you leave it as-is and we'll see later on, or you do something like declaring NLR_OS_WINDOWS in nlr.h so it can be used both there, to set MICROPY_NLR_NUM_REGS to 12, and in nlr.c?
> This was introduced years ago by you and it didn't change since

Indeed! And it looks like it allocates 2 more slots than it needs to.

> there the _WIN32 got introduced but nlr.h has not been changed. And this seems very wrong (not tested, but in nlrx64 I see more movq calls than there is space in nlr->regs, right?)

Yes it seems wrong.
Ok, let's leave this particular issue until later. For now I've aimed to make this PR not change any behaviour, as far as that is possible.
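For reference, the shared-macro idea floated in this thread could take roughly the following shape. This is only a sketch under stated assumptions: all `SKETCH_`/`sketch_` names are hypothetical, it covers x86-64 only, and the Windows slot count of 10 (8 plus rsi and rdi) is taken from the question above rather than verified against nlrx64.c; the existing code uses 12 for __CYGWIN__.

```c
#include <assert.h>
#include <stddef.h>

// One shared Windows check, using the condition quoted in this thread.
#if defined(_WIN32) || defined(__CYGWIN__)
#define SKETCH_NLR_OS_WINDOWS (1)
#else
#define SKETCH_NLR_OS_WINDOWS (0)
#endif

// Assumption: Windows x86-64 needs two extra slots (rsi, rdi), so 10 in total.
#if SKETCH_NLR_OS_WINDOWS
#define SKETCH_NLR_NUM_REGS (10)
#else
#define SKETCH_NLR_NUM_REGS (8)
#endif

// Hypothetical x86-64 buffer whose register area is sized by the shared macro.
typedef struct sketch_x64_nlr_buf {
    struct sketch_x64_nlr_buf *prev;
    void *ret_val;
    void *regs[SKETCH_NLR_NUM_REGS];
} sketch_x64_nlr_buf_t;

// Number of register slots actually present in the buffer.
static size_t sketch_nlr_reg_slots(void) {
    return sizeof(((sketch_x64_nlr_buf_t *)0)->regs) / sizeof(void *);
}

// Bounds-checked store, standing in for one register save in the assembler.
static void sketch_store_reg(sketch_x64_nlr_buf_t *nlr, size_t slot, void *value) {
    assert(slot < (size_t)SKETCH_NLR_NUM_REGS);
    nlr->regs[slot] = value;
}
```

Because the header and the implementation would test the same macro, the number of saved registers and the size of the register array cannot drift apart the way the mismatched _WIN32 and __CYGWIN__ checks did.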
And it's no longer unconditionally included by nlr.h, only if NLR_SETJMP is defined.
This was merged in 6a3a742
Should fix #3484