py/nlrthumb: Do not mark nlr_push as not returning anything. #3612
I see one of the tests fails, but it looks unrelated:
Since the comment on __builtin_unreachable says what it's there for, and that seems a good reason, I don't think simply removing it is acceptable (I'm not sure, as I didn't write it). I'd think the fix is to leave it there conditionally: present by default, but omitted when -flto is active and the gcc version is new enough.
That should be fixed now. If you rebase onto the latest then Travis should build OK.
As you see, it's not easy to get this right :) Especially to support all versions of gcc (of which older ones definitely seem buggy) and also LTO (which seems a bit fragile with respect to hand-written assembler code).
That's also what I'd suggest. @aykevl are you able to add such a check for the version (not sure which version...) and/or LTO?
921c5e7 to 87f8b47
I've investigated this a bit more, and updated the PR. Some older GCC versions indeed complain about not returning anything in a naked function. Version 4.6 and lower does, version 5.4 and up doesn't, so I've added a
To further verify that without
    00010344 <main>:
    10344: e92d4010 push {r4, lr}
    10348: eb000055 bl 104a4 <foo>
    1034c: e1a01000 mov r1, r0
    10350: e59f0008 ldr r0, [pc, #8] ; 10360 <main+0x1c>
    10354: ebffffeb bl 10308 <printf@plt>
    10358: e3a00000 mov r0, #0
    1035c: e8bd8010 pop {r4, pc}
    10360: 0001051c .word 0x0001051c
with
    00010314 <main>:
    10314: e92d4010 push {r4, lr}
    10318: eb00004f bl 1045c <foo>
So all in all I think
Note: it increases code size on Travis (using GCC 4.8). I'll test a few things and update this comment.
UPDATE: Again, there is a failing test, but I can't see how it's related to this PR:
@aykevl thanks for the detailed investigation and testing. I'm happy with the PR as it stands now.
py/nlrthumb.c
Outdated
I just noticed that this will probably not work well with non-GNUC compilers, because the `__GNUC__ < 4` check will probably succeed if `__GNUC__` is not defined. So I guess it needs to retain the `defined(__GNUC__)` bit at the start of the if.
Ah, yes, that check should be there. Fixed it.
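The pitfall raised in this review thread can be sketched in a few lines of C. In a `#if` expression, an identifier that is not defined as a macro evaluates to 0, so a bare "less than 4" comparison holds on a compiler that never defines the macro. The macro `HYPOTHETICAL_CC_MAJOR` and both function names below are made up for illustration; the macro merely stands in for `__GNUC__` on a non-GNUC compiler and is not taken from the PR.

```c
/* HYPOTHETICAL_CC_MAJOR is an illustrative macro that is deliberately never
   defined; it stands in for __GNUC__ on a compiler that does not define it. */

/* Unguarded check: the undefined identifier evaluates to 0, and 0 < 4 holds. */
#if HYPOTHETICAL_CC_MAJOR < 4
#define UNGUARDED_CHECK_FIRES 1
#else
#define UNGUARDED_CHECK_FIRES 0
#endif

/* Guarded check: defined(...) is false, so the whole condition is false. */
#if defined(HYPOTHETICAL_CC_MAJOR) && HYPOTHETICAL_CC_MAJOR < 4
#define GUARDED_CHECK_FIRES 1
#else
#define GUARDED_CHECK_FIRES 0
#endif

/* Small accessors so the two outcomes can be inspected at run time. */
int unguarded_check_fires(void) { return UNGUARDED_CHECK_FIRES; }
int guarded_check_fires(void) { return GUARDED_CHECK_FIRES; }
```

The unguarded form takes the workaround branch even though no version information exists, which is why keeping the `defined(...)` test at the start of the condition matters.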
By adding __builtin_unreachable() at the end of nlr_push, we're essentially telling the compiler that this function will never return. When GCC LTO is in use, this means that any time nlr_push() is called (which is often), the compiler thinks this function will never return and thus eliminates all code following the call.

Note: I've added a 'return 0' for older GCC versions like 4.6 which complain about not returning anything (which doesn't make sense in a naked function). Newer GCC versions (tested 4.8, 5.4 and some others) don't complain about this.
Thanks for updating the PR, and thanks for writing a good commit message! Merged.
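As a rough sketch of the kind of version guard the note describes, the condition could look like the following. This is not copied from the merged change: the helper `gcc_needs_explicit_return`, the macro `NLR_PUSH_NEEDS_RETURN`, and the exact cut-off "older than 4.8" are assumptions for illustration. The text only reports that 4.6 complains while 4.8 and 5.4 do not; 4.7 is not mentioned.

```c
/* Hypothetical helper, not part of MicroPython: returns 1 when a GCC version
   (major.minor) would get the explicit "return 0" workaround.
   Assumption: the cut-off is "older than 4.8". */
int gcc_needs_explicit_return(int major, int minor) {
    return major < 4 || (major == 4 && minor < 8);
}

/* The same condition as a preprocessor guard. defined(__GNUC__) comes first
   so that compilers which do not define __GNUC__ skip the workaround. */
#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8))
#define NLR_PUSH_NEEDS_RETURN 1
#else
#define NLR_PUSH_NEEDS_RETURN 0
#endif
```

The run-time helper exists only so the version logic can be checked against the versions named in the discussion; in real code only the preprocessor form would be needed.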
By adding `__builtin_unreachable()` at the end of `nlr_push`, we're essentially telling the compiler that this function will never return. When GCC link-time optimisation is in use, this means that any time `nlr_push()` is called (which is often), the compiler thinks this function will never return and thus eliminates all code following the call. It breaks the nrf port, which uses -flto to reduce code size. When the port compiles with 97cc485, the code size is reduced by about 10K so it was easy to see there's some invalid code elimination going on.

Note: older GCC versions might complain about a missing return statement, but at least version 5.4.1 (Debian stretch) doesn't. If there are any versions that complain, I would propose inserting a return statement anyway for that GCC version and lower as a workaround.
See also:
#3484
#3492
97cc485 (introducing the issue)
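For readers unfamiliar with the builtin, here is a minimal sketch of what `__builtin_unreachable()` promises. It assumes GCC or Clang, and the function `low_bit` is a made-up example unrelated to the MicroPython sources. The builtin tells the compiler that control never reaches that point, and here the promise is actually true.

```c
/* Hypothetical example function; requires GCC or Clang, which provide
   __builtin_unreachable(). */
int low_bit(unsigned int x) {
    switch (x & 1u) {
    case 0:
        return 0;
    case 1:
        return 1;
    }
    /* Both possible values of (x & 1u) returned above, so this point is never
       reached and the promise made to the compiler holds. */
    __builtin_unreachable();
}
```

The situation in the description above differs in one respect: there the builtin sat at the end of a function that does return to its caller, so the promise was false, and with -flto the optimiser was free to drop the code after each call.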