gcc 4.7.2 miscompiles stackless #19

Description
Originally reported by: RMTEW FULL NAME (Bitbucket: rmtew, GitHub: rmtew)
(originally reported in Trac by @akruis on 2013-04-07 09:39:02)
Compilers constantly improve. When I tried to build Stackless 2.7.4rc1 with gcc 4.7.2, the Stackless unit tests didn't terminate: Python went into an endless loop just before exiting.
It turned out that gcc over-optimised climb_stack_and_transfer. The compiler removed the alloca call, whose result is never used, and then performed a tail recursion optimisation. Pretty cool. :-)
Details: Linux amd64, gcc 4.7.2, option "-O2". The relevant command-line switches are:
-foptimize-sibling-calls
-ftree-vrp
-ftree-dce
Because this optimisation does not depend on the architecture, I suspect that this problem affects other architectures and Stackless versions too.
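For illustration, here is a simplified C sketch of the kind of code these passes can collapse. It is not the actual Stackless source: `climb_unused` and `return_seven` are hypothetical names, and the real function body may well differ. The point is only that the alloca result is never read, so dead code elimination may drop the call, and that the final call sits in tail position, so sibling call optimisation may turn it into a jump.

```c
#include <alloca.h>
#include <stddef.h>

/* Hypothetical stand-in for the routine that is called once enough
 * stack has been reserved. */
int return_seven(void)
{
    return 7;
}

/*
 * Hypothetical, simplified pattern (not the Stackless source):
 * reserve `needed` bytes of stack, then call `next`.
 */
int climb_unused(size_t needed, int (*next)(void))
{
    if (needed > 0) {
        /* The result is discarded, so an optimiser may treat the
         * whole alloca call as dead code and remove it. */
        (void)alloca(needed);
    }
    /* Call in tail position: with no live alloca left, the compiler
     * may replace it with a jump that reuses the current frame. */
    return next();
}
```

The return value is the same with or without the optimisation. What can silently disappear is the stack growth that the caller was relying on.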
I see two possibilities to fix this issue:
1. Add some #pragmas or specific compiler switches to inhibit the optimisation.
2. Use the pointer returned from alloca, i.e. store the pointer in a global variable.
I prefer option 2 because it is less compiler-specific, and the overhead of one additional write is negligible.
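A minimal sketch of what option 2 could look like, under some assumptions of mine: the names `climb_sink`, `climb_and_call` and `record_address` are hypothetical and not taken from the Stackless sources, the pointer is stored as an integer so that nothing dangles after the function returns, and the `volatile` qualifier is an extra precaution that the proposal above does not strictly require.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical global that receives the alloca result. The store is
 * volatile, so the compiler has to treat it as an observable side
 * effect and cannot discard the allocation as dead code. */
volatile uintptr_t climb_sink;

/* Hypothetical example callback: stores the address of one of its
 * own locals into the uintptr_t that `arg` points to. */
int record_address(void *arg)
{
    volatile char marker = 0;
    *(uintptr_t *)arg = (uintptr_t)&marker;
    return 0;
}

/*
 * Reserve `needed` bytes of stack, publish the pointer, then call
 * `transfer` while the reservation is still live. Returns whatever
 * `transfer` returns.
 */
int climb_and_call(size_t needed, int (*transfer)(void *), void *arg)
{
    if (needed > 0) {
        void *pad = alloca(needed);
        /* The one additional write: the pointer now escapes. */
        climb_sink = (uintptr_t)pad;
    }
    return transfer(arg);
}
```

Calling `climb_and_call(4096, record_address, &addr)` with a `uintptr_t addr` leaves a non-zero value in both `climb_sink` and `addr`. The cost is a single store per call, which matches the "negligible overhead" argument above.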