Fix linker failure when building opcache statically #18939

arnaud-lb · 2025-06-25T09:01:26Z

RFC: https://wiki.php.net/rfc/make_opcache_required

We use linker relocations to fetch the TLS index and offset of _tsrm_ls_cache. When building Opcache statically, linkers may attempt to optimize that into a more efficient code sequence (relaxing from "General Dynamic" to "Local Exec" model [1]). Unfortunately, linkers will fail, rather than ignore our relocations, when they don't recognize the exact code sequence they are expecting.

This results in errors as reported by #15074:

TLS transition from R_X86_64_TLSGD to R_X86_64_GOTTPOFF against
`_tsrm_ls_cache' at 0x12fc3 in section `.text' failed

Here I take a different approach:

* Emit the exact full code sequence expected by linkers.
* Extract the TLS index/offset by inspecting the linked ASM code, rather than executing it (execution would give us the thread-local address).
* Detect when the code was relaxed, in which case we can extract the TCB offset instead.
* Do this conservatively, so that if the linker did something we didn't expect, we fall back to a safer (but slower) mechanism.
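The inspection step can be sketched roughly as follows for x86-64. This is an illustrative sketch, not the code from this PR: the names `tls_seq_kind` and `classify_tls_sequence` are hypothetical, and the byte patterns are the standard General Dynamic sequence and its Local Exec relaxation from the x86-64 ELF TLS ABI. Other variants (for example the Initial Exec relaxation, or calls emitted without a PLT) are deliberately not covered, and anything unrecognized is reported as unknown so a caller can use the slower fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; not a name used by the PR. */
typedef enum {
    TLS_SEQ_UNKNOWN,         /* unrecognized: use the slow fallback */
    TLS_SEQ_GENERAL_DYNAMIC, /* sequence left exactly as emitted */
    TLS_SEQ_LOCAL_EXEC       /* sequence relaxed by the linker */
} tls_seq_kind;

/* General Dynamic (16 bytes):
 *   66 48 8d 3d <rel32>   data16 lea x@tlsgd(%rip), %rdi
 *   66 66 48 e8 <rel32>   data16 data16 rex.W call __tls_get_addr@PLT */
static const uint8_t gd_bytes[16] = {0x66, 0x48, 0x8d, 0x3d, 0, 0, 0, 0,
                                     0x66, 0x66, 0x48, 0xe8, 0, 0, 0, 0};
static const uint8_t gd_mask[16]  = {0xff, 0xff, 0xff, 0xff, 0, 0, 0, 0,
                                     0xff, 0xff, 0xff, 0xff, 0, 0, 0, 0};

/* Local Exec after relaxation (16 bytes):
 *   64 48 8b 04 25 00 00 00 00   mov %fs:0, %rax
 *   48 8d 80 <imm32>             lea x@tpoff(%rax), %rax */
static const uint8_t le_bytes[16] = {0x64, 0x48, 0x8b, 0x04, 0x25, 0, 0, 0, 0,
                                     0x48, 0x8d, 0x80, 0, 0, 0, 0};
static const uint8_t le_mask[16]  = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
                                     0xff, 0xff, 0xff, 0xff, 0xff, 0, 0, 0, 0};

/* Compare code against a pattern, ignoring bytes whose mask is 0. */
static bool match_masked(const uint8_t *code, const uint8_t *expect,
                         const uint8_t *mask, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if ((code[i] & mask[i]) != (expect[i] & mask[i])) {
            return false;
        }
    }
    return true;
}

/* Read a little-endian 32-bit immediate independently of host endianness. */
static int32_t read_i32_le(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}

/* Classify a 16-byte sequence and extract the immediate at its fixed position. */
tls_seq_kind classify_tls_sequence(const uint8_t code[16], int32_t *imm)
{
    assert(code != NULL && imm != NULL);
    if (match_masked(code, gd_bytes, gd_mask, 16)) {
        *imm = read_i32_le(code + 4);  /* RIP-relative displacement of the TLS descriptor */
        return TLS_SEQ_GENERAL_DYNAMIC;
    }
    if (match_masked(code, le_bytes, le_mask, 16)) {
        *imm = read_i32_le(code + 12); /* offset from the thread pointer */
        return TLS_SEQ_LOCAL_EXEC;
    }
    return TLS_SEQ_UNKNOWN;
}
```

The code is only read, never executed, so the result is the same regardless of which thread performs the inspection.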

One benefit of that is we are now able to use the Local Exec model in more cases, in JIT'ed code. This makes non-glibc builds faster in these cases.

This is tested on Linux (glibc, musl), FreeBSD, macOS, and Windows; with lld, gold, and bfd; with clang, gcc, and VS; on x86, x86_64, and aarch64 (not macOS/Apple Silicon, as JIT+ZTS is not supported on this combination yet, and not Windows/aarch64 for the same reason); and with various combinations of static/shared/dl(). The PR includes these tests. Other OSes fall back to the slower mechanism.

[1] https://www.akkadia.org/drepper/tls.pdf

This fixes the following linker error:

TLS transition from R_X86_64_TLSGD to R_X86_64_GOTTPOFF against `_tsrm_ls_cache' at 0x12fc3 in section `.text' failed

The error arises from how we obtain information about the _tsrm_ls_cache TLS variable for use in JIT'ed code:

Normally, TLS variables are resolved via linker relocations [1], which of course cannot be used in JIT'ed code. Therefore we emit the relocation in AOT code and use the result in JIT.

Specifically, we use a fragment of the "General Dynamic" code sequence described in [1]. Using the full code sequence would give us the address of the variable in the current thread. Therefore we only use a fragment that gives us the variable's TLS index and offset.

When Opcache is statically linked into the binary, linkers attempt to relax (rewrite) this code sequence into a more efficient one. However, this fails because they do not recognize the code sequence.

We now take a different approach:

* Emit the exact full code sequence expected by linkers
* Extract the TLS index/offset or TCB offset by inspecting the ASM code, rather than executing it (execution would give us the thread-local address).
* This is done in a conservative way so that if the linker did something we didn't expect, we fall back to a safer (but slower) mechanism.

[1] https://www.akkadia.org/drepper/tls.pdf

TimWolla

I cannot comment meaningfully on the code itself, but it certainly looks more maintainable now and I like that it's independently tested in CI.

ext/opcache/jit/tls/testing/main.c

ext/opcache/jit/tls/testing/test.sh

ext/opcache/jit/tls/zend_jit_tls_x86.c

TimWolla

Apart from the small typo I have no further remarks. I leave the actual review to someone who knows how this is supposed to work.

ext/opcache/jit/tls/zend_jit_tls_x86_64.c

Co-authored-by: Tim Düsterhus <[email protected]>

nielsdos · 2025-06-25T21:30:58Z

Good grief.
I will check this over the weekend. Honestly I hope there's another way. This is quite complex.

arnaud-lb · 2025-06-26T06:42:58Z

Thank you @nielsdos. It's not incredibly complex when considering that we compare the machine code verbatim (ignoring the imm operands). There is no full-blown instruction decoding or interpreting.

In Intel machine code, imms are encoded on whole bytes, so we compare the rest. On ARM, we mask the imms out of the fixed-size 32-bit instructions before comparison.

In zend_jit_resolve_tsrm_ls_cache_offsets() there are just two possibilities:

* The machine code is exactly what we emitted.
* The machine code changed, and is exactly what we expect the linker may have emitted instead.

In both cases we extract imm values at fixed positions in the machine code. The rest is what we did before: in the first case the imm values give us the address of the TLS descriptor, and in the second case the TCB offset.
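The masking described for fixed-size 32-bit ARM instructions can be sketched as below. This is a hypothetical illustration rather than the PR's code: `insn_pattern` and `extract_tprel_offset` are invented names, and the example only recognizes a `movz x0, #hi, lsl #16` / `movk x0, #lo` pair of the kind a linker may emit for a Local Exec offset. Which instruction slots a given linker rewrites is an assumption that real code would have to verify per linker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One expected instruction: its fixed bits, and a mask that clears the imm field. */
typedef struct {
    uint32_t bits;
    uint32_t mask;
} insn_pattern;

/* MOVZ/MOVK (64-bit) keep imm16 in bits 20..5; the mask keeps everything else,
 * including the shift selector (bits 22..21) and the destination register. */
static const insn_pattern movz_x0_lsl16 = {0xd2a00000u, 0xffe0001fu};
static const insn_pattern movk_x0       = {0xf2800000u, 0xffe0001fu};

static bool match_insn(uint32_t insn, insn_pattern p)
{
    return (insn & p.mask) == p.bits;
}

/* Returns true and stores the 32-bit offset if both instructions match;
 * returns false otherwise so the caller can use a slower fallback. */
bool extract_tprel_offset(const uint32_t insn[2], uint32_t *offset)
{
    assert(insn != NULL && offset != NULL);
    if (!match_insn(insn[0], movz_x0_lsl16) || !match_insn(insn[1], movk_x0)) {
        return false;
    }
    uint32_t hi = (insn[0] >> 5) & 0xffffu; /* imm16 of movz, upper half */
    uint32_t lo = (insn[1] >> 5) & 0xffffu; /* imm16 of movk, lower half */
    *offset = (hi << 16) | lo;
    return true;
}
```

Because every instruction has the same width and the imm field sits at a fixed bit range, one mask per instruction is enough; no general decoder is involved.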

nielsdos · 2025-06-26T06:55:38Z

Yes I understood the idea on a high level, but considering it deals with different platforms and instruction encodings, I would not call it simple. I wish there was a different solution

arnaud-lb · 2025-06-26T08:20:07Z

I wish too. Unfortunately I didn't find a good alternative.

Here is what I've checked:

* Disabling linker relaxation would have solved the issue, but it's not possible. Relaxation is just part of relocation processing and cannot be disabled in the linkers I've checked.
* Forcing _tsrm_ls_cache to use the global-dynamic model, or using a JIT-specific TSRM cache variable with a forced model, might work, but using this model in JIT is slower. One additional benefit of this PR is that it makes the Symfony demo benchmark 3% faster in some cases, because JIT is now able to use the local-exec model in these cases. Also, I was not able to actually force the dynamic model.
* Emitting code that's not eligible for relaxation (already relaxed), so the linker doesn't attempt it, would have worked as well. E.g. emitting _tsrm_ls_cache@gottpoff when we know the variable is eligible for the local-exec model, or _tsrm_ls_cache@tlsgd otherwise. But we cannot know that in advance, e.g. a static archive of PHP may be ultimately linked into a shared library or an executable, making gottpoff invalid in the former case or tlsgd relaxable in the latter.
* Having a separate TSRM cache for JIT, backed by pthread_key_create / pthread_getspecific, so we don't have to deal with relocations. This is what V8/JSC do as far as I can see, but they only support "inlining" pthread_getspecific on macOS and Windows. Unfortunately, inlining it on all platforms we care about would rely on considerably more platform/libc-specific details, with no ABI guarantees at all.
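For reference, the last alternative (a pthread-key-backed cache) could look roughly like the sketch below when not inlined. The names `jit_cache_set` and `jit_cache_get` are hypothetical and do not exist in PHP; only the POSIX calls are real. The portable form shown here costs a function call per access, which is why JIT'ed code would want to inline `pthread_getspecific`, and that inlining is the non-portable part.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical JIT-only cache slot, one value per thread. */
static pthread_key_t jit_cache_key;
static pthread_once_t jit_cache_once = PTHREAD_ONCE_INIT;
static int jit_cache_key_ok;

/* Runs exactly once per process, no matter which thread gets here first. */
static void jit_cache_make_key(void)
{
    jit_cache_key_ok = (pthread_key_create(&jit_cache_key, NULL) == 0);
}

/* Store this thread's cache pointer; returns 0 on success, -1 on failure. */
int jit_cache_set(void *cache)
{
    pthread_once(&jit_cache_once, jit_cache_make_key);
    if (!jit_cache_key_ok) {
        return -1;
    }
    return pthread_setspecific(jit_cache_key, cache) == 0 ? 0 : -1;
}

/* Fetch this thread's cache pointer, or NULL if none was stored. */
void *jit_cache_get(void)
{
    pthread_once(&jit_cache_once, jit_cache_make_key);
    return jit_cache_key_ok ? pthread_getspecific(jit_cache_key) : NULL;
}
```

On older toolchains this may need to be built with `-pthread`.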

github-actions bot added Category: Build System Extension: opcache labels Jun 25, 2025

TimWolla reviewed Jun 25, 2025


arnaud-lb added 4 commits June 25, 2025 11:30

WS

bf54d95

Fix

93f2cb1

Use memcmp() for consistency

44dc3dc

Fix FreeBSD CI

d2ef089

TimWolla reviewed Jun 25, 2025


Update ext/opcache/jit/tls/zend_jit_tls_x86_64.c

8f36ff9

Co-authored-by: Tim Düsterhus <[email protected]>

arnaud-lb requested a review from nielsdos June 25, 2025 11:44

arnaud-lb marked this pull request as ready for review June 25, 2025 11:45

arnaud-lb requested a review from dstogov as a code owner June 25, 2025 11:45
