Commit 866d7c8

feat: otel thread ctx FFI (#1915)
# What does this PR do?

This PR adds a basic FFI for the OTel thread-level context feature: create a new context, attach, detach, and update in place.

We also make `ThreadContextRecord` public, or at least exposed in the FFI. The rationale is that:

1. Its layout is imposed by the spec, so exposing it should not be a liability regarding breaking changes: we can't really touch it anyway.
2. As mentioned in the FFI documentation, SDKs may end up updating contexts themselves after publication, without going through libdatadog at all. In this usage mode, exporting the C struct `ThreadContextRecord` is a way to document its expected memory layout.

<details>
<summary>Generated C header</summary>

```c
// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0

#ifndef DDOG_OTEL_THREAD_CTX_H
#define DDOG_OTEL_THREAD_CTX_H

#pragma once

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/**
 * In-memory layout of a thread-level context.
 *
 * **CAUTION**: The structure MUST match exactly the OTel thread-level context specification.
 * It is read by external, out-of-process code. Do not re-order fields or modify in any way,
 * unless you know exactly what you're doing.
 *
 * # Synchronization
 *
 * Readers are async-signal handlers. The writer is always stopped while a reader runs.
 * Sharing memory with a signal handler still requires some form of synchronization, which is
 * achieved through atomics and compiler fence, using `valid` and/or the TLS slot as
 * synchronization points.
 *
 * - The writer stores `valid = 0` *before* modifying fields in-place, guarded by a fence.
 * - The writer stores `valid = 1` *after* all fields are populated, guarded by a fence.
 * - `valid` starts at `1` on construction and is never set to `0` except during an in-place
 *   update.
 */
typedef struct ddog_ThreadContextRecord {
  /**
   * Trace identifier; all-zeroes means "no trace".
   */
  uint8_t trace_id[16];
  /**
   * Span identifier.
   */
  uint8_t span_id[8];
  /**
   * Whether the record is ready/consistent. Always set to `1` except during in-place update
   * of the current record.
   */
  uint8_t valid;
  uint8_t _reserved;
  /**
   * Number of populated bytes in `attrs_data`.
   */
  uint16_t attrs_data_size;
  /**
   * Packed variable-length key-value records.
   *
   * It's a contiguous list of blocks with layout:
   *
   * 1. 1-byte `key_index`
   * 2. 1-byte `val_len`
   * 3. `val_len` bytes of a string value.
   *
   * # Size
   *
   * Currently, we always allocate the max recommended size. This potentially wastes a few
   * hundred bytes per thread, but it guarantees that we can modify the context in-place
   * without (re)allocation in the hot path. Having a hybrid scheme (starting smaller and
   * resizing up a few times) is not out of the question.
   */
  uint8_t attrs_data[ddog_MAX_ATTRS_DATA_SIZE];
} ddog_ThreadContextRecord;

#ifdef __cplusplus
extern "C" {
#endif // __cplusplus

/**
 * Allocate and initialise a new thread context.
 *
 * Returns a non-null owned handle that must eventually be released with
 * `ddog_otel_thread_ctx_free`.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_new(const uint8_t (*trace_id)[16],
                                                          const uint8_t (*span_id)[8],
                                                          const uint8_t (*local_root_span_id)[8]);

/**
 * Free an owned thread context.
 *
 * # Safety
 *
 * `ctx` must be a valid non-null pointer obtained from `ddog_otel_thread_ctx_new` or
 * `ddog_otel_thread_ctx_detach`, and must not be used after this call. In particular, `ctx`
 * must not be currently attached to a thread.
 */
void ddog_otel_thread_ctx_free(struct ddog_ThreadContextRecord *ctx);

/**
 * Attach `ctx` to the current thread. Returns the previously attached context if any, or null
 * otherwise.
 *
 * # Safety
 *
 * `ctx` must be a valid non-null pointer obtained from this API. Ownership of `ctx` is
 * transferred to the TLS slot: the caller must not drop `ctx` while it is still actively
 * attached.
 *
 * ## In-place update
 *
 * The preferred method to update the thread context in place is [ddog_otel_thread_ctx_update].
 *
 * If calling into native code is too costly, it is possible to update an attached context
 * directly in-memory without going through libdatadog (contexts are guaranteed to have a
 * stable address through their lifetime). **HOWEVER, IF DOING SO, PLEASE BE VERY CAUTIOUS OF
 * THE FOLLOWING POINTS**:
 *
 * 1. The update process requires a [seqlock](https://en.wikipedia.org/wiki/Seqlock)-like
 *    pattern: [ThreadContextRecord::valid] must be first set to `0` before the update and set
 *    to `1` again at the end. Additionally, depending on your language's memory model, you
 *    might need specific synchronization primitives (compiler fences, atomics, etc.), since
 *    the context can be read by an asynchronous signal handler at any point in time. See the
 *    [Otel thread context
 *    specification](open-telemetry/opentelemetry-specification#4947)
 *    for more details.
 * 2. Only update the context from the thread it's attached to. Contexts are designed to be
 *    attached, written to and read from on the same thread (whether from signal code or
 *    program code). Thus, they are NOT thread-safe. Given the current specification, I don't
 *    think it's possible to safely update an attached context from a different thread, since
 *    the signal handler doesn't assume the context can be written to concurrently from another
 *    thread.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_attach(struct ddog_ThreadContextRecord *ctx);

/**
 * Remove the currently attached context from the TLS slot.
 *
 * Returns the detached context (caller now owns it and must release it with
 * `ddog_otel_thread_ctx_free`), or null if the slot was empty.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_detach(void);

/**
 * Update the currently attached context in-place.
 *
 * If no context is currently attached, one is created and attached, equivalent to calling
 * `ddog_otel_thread_ctx_new` followed by `ddog_otel_thread_ctx_attach`.
 */
void ddog_otel_thread_ctx_update(const uint8_t (*trace_id)[16],
                                 const uint8_t (*span_id)[8],
                                 const uint8_t (*local_root_span_id)[8]);

#ifdef __cplusplus
}  // extern "C"
#endif  // __cplusplus

#endif  /* DDOG_OTEL_THREAD_CTX_H */
```

</details>

# Motivation

OTel thread-level context was implemented in #1791 to provide better interop with the OTel eBPF profiler. The first user is supposed to be dd-trace-rs, but it turns out the dotnet SDK people are interested in using it as well (and eventually other non-Rust SDKs will use it and thus require an FFI).

# Additional Notes

N/A

# How to test the change?

There's a test to check that the TLS symbol is properly handled. For real usage, we plan to check when integrating in dotnet (or whichever is the first SDK to use it).

Co-authored-by: yann.hamdaoui <[email protected]>
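The packed `attrs_data` layout documented in the header (1-byte `key_index`, 1-byte `val_len`, then `val_len` bytes of value) can be illustrated with a small encoder and decoder. This is only a sketch: `pack_attrs` and `unpack_attrs` are hypothetical helpers that are not part of libdatadog, and the error policy (returning `None` on oversized or truncated input) is an assumption. Note that the FFI added in this PR always passes an empty attribute list.

```rust
/// Mirrors `MAX_ATTRS_DATA_SIZE` from this PR.
const MAX_ATTRS_DATA_SIZE: usize = 612;

/// Hypothetical helper: pack `(key_index, value)` pairs into the documented
/// block layout (1-byte `key_index`, 1-byte `val_len`, `val_len` value bytes).
///
/// Returns `None` if a value is longer than 255 bytes or the packed data
/// would not fit in `MAX_ATTRS_DATA_SIZE` (assumed policy for this sketch).
fn pack_attrs(attrs: &[(u8, &str)]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for &(key_index, value) in attrs {
        // `val_len` is a single byte, so values are capped at 255 bytes.
        if value.len() > u8::MAX as usize {
            return None;
        }
        if out.len() + 2 + value.len() > MAX_ATTRS_DATA_SIZE {
            return None;
        }
        out.push(key_index);
        out.push(value.len() as u8);
        out.extend_from_slice(value.as_bytes());
    }
    Some(out)
}

/// Hypothetical helper: walk a packed buffer back into
/// `(key_index, value bytes)` pairs. Returns `None` on a truncated block.
fn unpack_attrs(data: &[u8]) -> Option<Vec<(u8, &[u8])>> {
    let mut out = Vec::new();
    let mut rest = data;
    while !rest.is_empty() {
        let (&key_index, tail) = rest.split_first()?;
        let (&val_len, tail) = tail.split_first()?;
        if tail.len() < val_len as usize {
            return None;
        }
        let (value, tail) = tail.split_at(val_len as usize);
        out.push((key_index, value));
        rest = tail;
    }
    Some(out)
}
```

The length of the vector returned by `pack_attrs` corresponds to what the record stores in `attrs_data_size`.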
1 parent c713122 commit 866d7c8

11 files changed

Lines changed: 357 additions & 13 deletions

File tree

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ libdd-http-client @DataDog/apm-common-components-core
 libdd-library-config*/ @DataDog/apm-sdk-capabilities-rust
 libdd-log*/ @DataDog/apm-common-components-core
 libdd-otel-thread-ctx/ @DataDog/apm-common-components-core
+libdd-otel-thread-ctx-ffi/ @DataDog/apm-common-components-core
 libdd-profiling*/ @DataDog/libdatadog-profiling
 libdd-sampling/ @DataDog/apm-common-components-core
 libdd-shared-runtime*/ @DataDog/apm-common-components-core

Cargo.lock

Lines changed: 9 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ members = [
   "datadog-live-debugger",
   "datadog-live-debugger-ffi",
   "libdd-otel-thread-ctx",
+  "libdd-otel-thread-ctx-ffi",
   "libdd-profiling",
   "libdd-profiling-ffi",
   "libdd-profiling-protobuf",
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
+# SPDX-License-Identifier: Apache-2.0
+
+[package]
+name = "libdd-otel-thread-ctx-ffi"
+version = "1.0.0"
+description = "FFI bindings for the OTel thread-level context publisher"
+edition.workspace = true
+rust-version.workspace = true
+license.workspace = true
+publish = false
+
+[lib]
+crate-type = ["staticlib", "cdylib", "lib"]
+bench = false
+
+[dependencies]
+libdd-common-ffi = { path = "../libdd-common-ffi", default-features = false }
+libdd-otel-thread-ctx = { path = "../libdd-otel-thread-ctx" }
+
+[features]
+default = ["cbindgen"]
+cbindgen = ["build_common/cbindgen", "libdd-common-ffi/cbindgen"]
+
+[build-dependencies]
+build_common = { path = "../build-common" }

libdd-otel-thread-ctx-ffi/build.rs

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
+// SPDX-License-Identifier: Apache-2.0
+extern crate build_common;
+
+use build_common::generate_and_configure_header;
+use std::env;
+use std::path::PathBuf;
+use std::process::Command;
+
+/// Locate the `gcc-ld/` shim directory shipped with the Rust toolchain.
+///
+/// This directory contains an `ld.lld` wrapper that delegates to `rust-lld`.
+/// Passing it via `-B` to the C compiler driver makes it discover rust-lld
+/// before any system-wide lld, which:
+///
+/// 1. avoids the need for a system-wide LLD install
+/// 2. picks a recent LLD, as opposed to e.g. CentOS 7's LLVM 7, which is too old to handle TLSDESC
+///    relocations properly.
+fn find_rust_lld_dir() -> Option<PathBuf> {
+    let rustc = env::var("RUSTC").unwrap_or_else(|_| "rustc".into());
+    let target = env::var("TARGET").ok()?;
+
+    let output = Command::new(&rustc)
+        .arg("--print")
+        .arg("sysroot")
+        .output()
+        .ok()?;
+
+    let sysroot = std::str::from_utf8(&output.stdout).ok()?.trim();
+    let dir = PathBuf::from(sysroot)
+        .join("lib/rustlib")
+        .join(&target)
+        .join("bin/gcc-ld");
+
+    dir.join("ld.lld").exists().then_some(dir)
+}
+
+fn main() {
+    generate_and_configure_header("otel-thread-ctx.h");
+    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap();
+
+    // Export the TLSDESC thread-local variable to the dynamic symbol table so external readers
+    // (e.g. the eBPF profiler) can discover it. Rust's cdylib linker applies a version script with
+    // `local: *` that hides all symbols not explicitly allowlisted, and also causes lld to relax
+    // the TLSDESC access, eliminating the dynsym entry entirely.
+    //
+    // Passing our own version script with an explicit `global:` entry for the symbol beats the
+    // `local: *` wildcard and prevents that relaxation.
+    //
+    // Merging multiple version scripts is not supported by GNU ld, so we need lld. We prefer the
+    // toolchain's bundled rust-lld (LLD 19+ since Rust 1.84) over the system lld (if it even
+    // exists). If rust-lld is not found we fall back to whatever `lld` the system provides.
+    if target_os == "linux" {
+        let manifest_dir = env::var("CARGO_MANIFEST_DIR").unwrap();
+
+        if let Some(gcc_ld_dir) = find_rust_lld_dir() {
+            println!("cargo:rustc-cdylib-link-arg=-B{}", gcc_ld_dir.display());
+        }
+        println!("cargo:rustc-cdylib-link-arg=-fuse-ld=lld");
+        println!(
+            "cargo:rustc-cdylib-link-arg=-Wl,--version-script={manifest_dir}/tls-dynamic-list.txt"
+        );
+    }
+}
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+# Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
+# SPDX-License-Identifier: Apache-2.0
+
+language = "C"
+cpp_compat = true
+tab_width = 2
+header = """// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
+// SPDX-License-Identifier: Apache-2.0
+"""
+include_guard = "DDOG_OTEL_THREAD_CTX_H"
+style = "both"
+pragma_once = true
+no_includes = true
+sys_includes = ["stdbool.h", "stddef.h", "stdint.h"]
+
+[parse]
+parse_deps = true
+include = ["libdd-common-ffi", "libdd-otel-thread-ctx"]
+
+[export]
+prefix = "ddog_"
+renaming_overrides_prefixing = true
+
+[export.mangle]
+rename_types = "PascalCase"
+
+[enum]
+prefix_with_name = true
+rename_variants = "ScreamingSnakeCase"
Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
+// SPDX-License-Identifier: Apache-2.0
+
+//! FFI bindings for the OTel thread-level context publisher.
+//!
+//! All symbols are only available on Linux, since the spec is currently Linux-specific.
+
+#[cfg(target_os = "linux")]
+pub use linux::*;
+
+#[cfg(target_os = "linux")]
+mod linux {
+    use libdd_otel_thread_ctx::linux::{ThreadContext, ThreadContextHandle};
+    use std::ptr::NonNull;
+
+    /// Maximum size in bytes of the `attrs_data` field of a thread context record.
+    // This is ugly, but I couldn't get cbindgen to generate the corresponding #define in any other
+    // way. It doesn't like re-exports (pub use), and doing `pub const MAX_ATTRS_DATA_SIZE = _MAX`
+    // (where `_MAX` has been imported properly) generates something dumb such as `#define
+    // ddog_MAX_ATTRS_DATA_SIZE = _MAX` instead of propagating the actual value.
+    // This solution is at least marginally better than prepending a hardcoded define manually in
+    // build.rs, as it will at least keep the value in sync.
+    pub const MAX_ATTRS_DATA_SIZE: usize = 612;
+    const _: () = assert!(
+        MAX_ATTRS_DATA_SIZE == libdd_otel_thread_ctx::linux::MAX_ATTRS_DATA_SIZE,
+        "MAX_ATTRS_DATA_SIZE out of sync with libdd-otel-thread-ctx"
+    );
+
+    /// Allocate and initialise a new thread context.
+    ///
+    /// Returns a non-null owned handle that must eventually be released with
+    /// `ddog_otel_thread_ctx_free`.
+    #[no_mangle]
+    pub extern "C" fn ddog_otel_thread_ctx_new(
+        trace_id: &[u8; 16],
+        span_id: &[u8; 8],
+        local_root_span_id: &[u8; 8],
+    ) -> NonNull<ThreadContextHandle> {
+        ThreadContext::new(*trace_id, *span_id, *local_root_span_id, &[]).into_opaque_ptr()
+    }
+
+    /// Free an owned thread context.
+    ///
+    /// # Safety
+    ///
+    /// `ctx` must be a valid non-null pointer obtained from `ddog_otel_thread_ctx_new` or
+    /// `ddog_otel_thread_ctx_detach`, and must not be used after this call. In particular, `ctx`
+    /// must not be currently attached to a thread.
+    #[no_mangle]
+    pub unsafe extern "C" fn ddog_otel_thread_ctx_free(ctx: *mut ThreadContextHandle) {
+        if let Some(ctx) = NonNull::new(ctx) {
+            let _ = ThreadContext::from_opaque_ptr(ctx);
+        }
+    }
+
+    /// Attach `ctx` to the current thread. Returns the previously attached context if any, or null
+    /// otherwise.
+    ///
+    /// # Safety
+    ///
+    /// `ctx` must be a valid non-null pointer obtained from this API. Ownership of `ctx` is
+    /// transferred to the TLS slot: the caller must not drop `ctx` while it is still actively
+    /// attached.
+    #[no_mangle]
+    pub unsafe extern "C" fn ddog_otel_thread_ctx_attach(
+        ctx: *mut ThreadContextHandle,
+    ) -> Option<NonNull<ThreadContextHandle>> {
+        ThreadContext::from_opaque_ptr(NonNull::new(ctx)?)
+            .attach()
+            .map(ThreadContext::into_opaque_ptr)
+    }
+
+    /// Remove the currently attached context from the TLS slot.
+    ///
+    /// Returns the detached context (caller now owns it and must release it with
+    /// `ddog_otel_thread_ctx_free`), or null if the slot was empty.
+    #[no_mangle]
+    pub extern "C" fn ddog_otel_thread_ctx_detach() -> Option<NonNull<ThreadContextHandle>> {
+        ThreadContext::detach().map(ThreadContext::into_opaque_ptr)
+    }
+
+    /// Update the currently attached context in-place.
+    ///
+    /// If no context is currently attached, one is created and attached, equivalent to calling
+    /// `ddog_otel_thread_ctx_new` followed by `ddog_otel_thread_ctx_attach`.
+    #[no_mangle]
+    pub extern "C" fn ddog_otel_thread_ctx_update(
+        trace_id: &[u8; 16],
+        span_id: &[u8; 8],
+        local_root_span_id: &[u8; 8],
+    ) {
+        ThreadContext::update(*trace_id, *span_id, *local_root_span_id, &[]);
+    }
+}
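For SDKs that write an attached record directly instead of calling `ddog_otel_thread_ctx_update`, the header documentation describes a seqlock-like protocol around `valid`. The following is a minimal sketch of that write sequence, not libdatadog's implementation. It assumes a local `Record` type that mirrors the documented field layout. The names `Record`, `update_in_place`, and `try_read` are hypothetical, and the choice of `Relaxed` stores combined with `compiler_fence(SeqCst)` is an assumption of this sketch.

```rust
use std::sync::atomic::{compiler_fence, AtomicU8, Ordering};

/// Mirrors `MAX_ATTRS_DATA_SIZE` from this PR.
const MAX_ATTRS_DATA_SIZE: usize = 612;

/// Hypothetical mirror of `ddog_ThreadContextRecord`. `valid` is modeled as
/// an `AtomicU8`, which has the same size and alignment as `u8`.
#[repr(C)]
struct Record {
    trace_id: [u8; 16],
    span_id: [u8; 8],
    valid: AtomicU8,
    _reserved: u8,
    attrs_data_size: u16,
    attrs_data: [u8; MAX_ATTRS_DATA_SIZE],
}

impl Record {
    /// `valid` starts at 1, as the header documentation states.
    fn new(trace_id: [u8; 16], span_id: [u8; 8]) -> Self {
        Record {
            trace_id,
            span_id,
            valid: AtomicU8::new(1),
            _reserved: 0,
            attrs_data_size: 0,
            attrs_data: [0; MAX_ATTRS_DATA_SIZE],
        }
    }
}

/// Update a record in place.
///
/// Safety: `rec` must point to a live `Record`, and this must only be called
/// from the thread the record is attached to.
unsafe fn update_in_place(rec: *mut Record, trace_id: [u8; 16], span_id: [u8; 8]) {
    unsafe {
        let valid = &(*rec).valid;
        // 1. Mark the record inconsistent before touching any field.
        valid.store(0, Ordering::Relaxed);
        compiler_fence(Ordering::SeqCst);
        // 2. Write the new field values.
        std::ptr::addr_of_mut!((*rec).trace_id).write(trace_id);
        std::ptr::addr_of_mut!((*rec).span_id).write(span_id);
        // 3. Publish: the fence keeps the field writes before `valid = 1`.
        compiler_fence(Ordering::SeqCst);
        valid.store(1, Ordering::Relaxed);
    }
}

/// What an in-process reader could do: only trust the fields when `valid`
/// is 1. (Out-of-process readers follow the OTel spec instead.)
fn try_read(rec: &Record) -> Option<([u8; 16], [u8; 8])> {
    if rec.valid.load(Ordering::Relaxed) != 1 {
        return None;
    }
    compiler_fence(Ordering::SeqCst);
    Some((rec.trace_id, rec.span_id))
}
```

A compiler fence is enough here only because, per the header documentation, the writer is always stopped while a reader runs. Writing from another thread would need a different scheme, which the documentation explicitly warns against.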
Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
+// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
+// SPDX-License-Identifier: Apache-2.0
+
+//! Verify ELF properties of the built cdylib on Linux.
+//!
+//! These tests check that:
+//! - `otel_thread_ctx_v1` is exported in the dynamic symbol table as a TLS GLOBAL symbol.
+//! - `otel_thread_ctx_v1` is accessed via TLSDESC relocations (R_X86_64_TLSDESC or
+//!   R_AARCH64_TLSDESC), as required by the OTel thread-level context sharing spec.
+//!
+//! The cdylib path is derived at runtime from the test executable location.
+//! Both the test binary and the cdylib live in `target/<[triple/]profile>/deps/`.
+
+#![cfg(target_os = "linux")]
+
+use std::path::PathBuf;
+use std::process::Command;
+
+const SYMBOL: &str = "otel_thread_ctx_v1";
+
+fn cdylib_path() -> PathBuf {
+    // test binary: target/<[triple/]profile>/deps/<name>
+    // cdylib: target/<[triple/]profile>/deps/liblibdd_otel_thread_ctx_ffi.so
+    let exe = std::env::current_exe().expect("failed to read current executable path");
+    exe.parent()
+        .expect("unexpected test executable path structure")
+        .join("liblibdd_otel_thread_ctx_ffi.so")
+}
+
+fn check_cdylib_readable(path: &PathBuf) {
+    assert!(
+        std::fs::File::open(path).is_ok(),
+        "cdylib at {} could not be opened for reading",
+        path.display()
+    );
+}
+
+fn readelf(args: &[&str], path: &PathBuf) -> String {
+    let out = Command::new("readelf")
+        .args(args)
+        .arg(path)
+        .output()
+        .expect("failed to run readelf. Is binutils installed?");
+    String::from_utf8_lossy(&out.stdout).into_owned()
+}
+
+#[test]
+#[cfg_attr(miri, ignore)]
+fn otel_thread_ctx_v1_in_dynsym() {
+    let path = cdylib_path();
+    check_cdylib_readable(&path);
+    let output = readelf(&["-W", "--dyn-syms"], &path);
+    let line = output
+        .lines()
+        .find(|l| l.contains(SYMBOL))
+        .unwrap_or_else(|| panic!("'{SYMBOL}' not found in dynsym of {}", path.display()));
+    assert!(
+        line.contains("TLS") && line.contains("GLOBAL"),
+        "'{SYMBOL}' is in dynsym but not as TLS GLOBAL — got:\n {line}"
+    );
+}
+
+#[test]
+#[cfg_attr(miri, ignore)]
+fn otel_thread_ctx_v1_tlsdesc_reloc() {
+    let path = cdylib_path();
+    check_cdylib_readable(&path);
+    let output = readelf(&["-W", "--relocs"], &path);
+    let found = output.lines().any(|l| {
+        l.contains(SYMBOL) && (l.contains("R_X86_64_TLSDESC") || l.contains("R_AARCH64_TLSDESC"))
+    });
+    assert!(
+        found,
+        "No TLSDESC relocation found for '{SYMBOL}' in {}\n\
+        All relocations mentioning the symbol:\n{}",
+        path.display(),
+        output
+            .lines()
+            .filter(|l| l.contains(SYMBOL))
+            .map(|l| format!(" {l}"))
+            .collect::<Vec<_>>()
+            .join("\n")
+    );
+}
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+{
+  global: otel_thread_ctx_v1;
+};
