Contributor

@Nemo157 Nemo157 commented Oct 10, 2025

See the individual commit descriptions for full context.

Overall, this removes traces and root spans from the telemetry protocol and moves the resolution of "implicit parents" from the producer to the consumer. For the consumer to correctly track the implicit parent of each event, the execution id now needs to mix in a thread id. Because this one id (which is queried via the OSAL) is part of the output data, the producer needs no other thread-local state, so it can be used on systems that don't provide any.
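
As a rough illustration of the idea in the description, here is a minimal sketch of a producer that stamps messages with a process id plus a thread id obtained from an OS abstraction. All names and integer widths here (`Osal`, `Collector`, `FixedThread`, `u128`/`u64`) are assumptions made for the example and are not taken from the veecle-telemetry crate.

```rust
/// Hypothetical id types; the integer widths are assumptions for illustration.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct ProcessId(pub u128);

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct ThreadId(pub u64);

/// Identifies one thread/task within one process.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct ExecutionId {
    pub process: ProcessId,
    pub thread: ThreadId,
}

/// Hypothetical OS abstraction: the only per-thread lookup the producer needs.
pub trait Osal {
    fn current_thread_id(&self) -> ThreadId;
}

/// Hypothetical producer-side collector that keeps no thread-local state.
pub struct Collector<O: Osal> {
    process: ProcessId,
    osal: O,
}

impl<O: Osal> Collector<O> {
    pub fn new(process: ProcessId, osal: O) -> Self {
        Self { process, osal }
    }

    /// Builds the id that would be attached to every emitted message.
    pub fn execution_id(&self) -> ExecutionId {
        ExecutionId {
            process: self.process,
            thread: self.osal.current_thread_id(),
        }
    }
}

/// Test double standing in for a real OSAL implementation.
pub struct FixedThread(pub u64);

impl Osal for FixedThread {
    fn current_thread_id(&self) -> ThreadId {
        ThreadId(self.0)
    }
}
```

With this shape, the producer only needs the process id it was initialized with and whatever the OSAL reports for the current thread or task.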

This involves breaking changes to both the veecle-telemetry crate API and the JSON encoding.

Closes: DEV-911, DEV-913

github-actions bot commented Oct 10, 2025

@Nemo157 Nemo157 force-pushed the wim/push-pxywosxxsluy branch from 11a89b2 to 739c5f9 Compare October 10, 2025 14:21
@Nemo157 Nemo157 marked this pull request as ready for review October 10, 2025 14:29
claude bot commented Oct 10, 2025

Change Summary

This PR removes the TraceId concept and thread-local state from the telemetry system. The SpanContext now uses ProcessId directly for span identification. The execution ID has been changed to include both a process ID and thread ID to uniquely identify thread/task combinations, allowing the consumer to track implicit parent spans without requiring thread-local state on the producer side. This enables telemetry on systems without thread-local storage support.

The changes include breaking API changes to the veecle-telemetry crate and modifications to the JSON encoding format. All examples and runtime code have been updated to use the new ProcessId instead of ExecutionId when setting exporters.
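
The `ProcessId` + `SpanId` pairing mentioned in the summary can be sketched as follows. How the crate actually generates span ids is not stated in this conversation; the shared atomic counter (`SpanIds`) below is purely an assumption for illustration, chosen because it needs no thread-local state. Note that `AtomicU64` is not available on every embedded target.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical id types; widths are assumptions.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct ProcessId(pub u128);

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct SpanId(pub u64);

/// A span id is only unique within its process; the pair is globally unique.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct SpanContext {
    pub process: ProcessId,
    pub span: SpanId,
}

/// Hypothetical generator: one process-wide counter shared by all threads.
pub struct SpanIds {
    next: AtomicU64,
}

impl SpanIds {
    pub const fn new() -> Self {
        Self { next: AtomicU64::new(1) }
    }

    /// Hands out the next span id and pairs it with the given process id.
    pub fn next(&self, process: ProcessId) -> SpanContext {
        let span = SpanId(self.next.fetch_add(1, Ordering::Relaxed));
        SpanContext { process, span }
    }
}
```

Two processes may hand out the same `SpanId`, but their `SpanContext` values still differ because the process ids differ.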


Issues Found

🟡 Style Guide - Inconsistent Formatting in ProcessId::Display

The Display implementation for ProcessId (veecle-telemetry/src/id.rs:48) uses a format string that may not produce consistent output across all values.

Location: veecle-telemetry/src/id.rs:48

Current code:

write!(f, "{:016x}", self.0)

Issue: A ProcessId is a u128, but the format string pads to only 16 hex digits (64 bits). The width is a minimum, so nothing is truncated; values larger than u64::MAX simply print with more than 16 digits, which makes the output width inconsistent.

Expected: The format string should use 32 hex digits to fully represent the 128-bit value:

write!(f, "{:032x}", self.0)

This is consistent with the serialization implementation which correctly uses all 16 bytes (32 hex chars) of the u128.
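
To make the width behaviour concrete, here is a small sketch using a stand-in `ProcessId(u128)` newtype (assumed, not copied from the crate) and a hypothetical `narrow` helper that reproduces the 16-digit format. In Rust's formatting syntax the width is a minimum, so a wider value is printed in full rather than cut off.

```rust
use std::fmt;

/// Hypothetical stand-in for the crate's `ProcessId`; assumed to wrap a `u128`.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct ProcessId(pub u128);

impl fmt::Display for ProcessId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Zero-pad to 32 hex digits so every 128-bit value has the same width.
        write!(f, "{:032x}", self.0)
    }
}

/// The 16-digit format from the review, kept only for comparison.
pub fn narrow(id: ProcessId) -> String {
    // The width is a minimum: small values are padded, large values grow.
    format!("{:016x}", id.0)
}
```

With `{:016x}`, `ProcessId(1)` renders with 16 digits while `ProcessId(u128::MAX)` renders with 32; with `{:032x}` both render with 32.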

codecov bot commented Oct 10, 2025

Codecov Report

❌ Patch coverage is 70.26316% with 113 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| veecle-telemetry-ui/src/store/mod.rs | 0.00% | 57 Missing ⚠️ |
| veecle-telemetry/src/id.rs | 77.03% | 25 Missing and 6 partials ⚠️ |
| veecle-telemetry/src/protocol.rs | 60.31% | 20 Missing and 5 partials ⚠️ |

@Nemo157 Nemo157 force-pushed the wim/push-pxywosxxsluy branch from 739c5f9 to 4341f4e Compare October 10, 2025 14:36
self.borrow_mut().take()
pub(crate) fn take(
&self,
#[cfg(feature = "veecle-telemetry")] span_context: Option<SpanContext>,

This is very unfortunate, but unavoidable I suppose

Contributor Author

TBH I would get rid of the feature and just always have this code, since it's supposed to be zero-cost without veecle-telemetry/enable. But that's separate from these changes.

I was talking about passing down the SpanContext, I'd really prefer a solution where this isn't necessary (thread locals and alternatives on no_std) because it makes some things very awkward, but I guess this is what we decided on doing

root []
+ attr: runtime_attr="added_later"
+ link: trace=123456789abcdef0, span=fedcba9876543210
+ link: span=fedcba9876543210

this should be printing the process id now?

@Nemo157 Nemo157 force-pushed the wim/push-pxywosxxsluy branch 2 times, most recently from 8175512 to af394c0 Compare October 13, 2025 10:54
@Nemo157 Nemo157 force-pushed the wim/push-pxywosxxsluy branch 2 times, most recently from 57070a8 to 8b1b3b3 Compare October 14, 2025 09:49
@Nemo157 Nemo157 force-pushed the wim/push-pxywosxxsluy branch 2 times, most recently from 88f862d to 969a3f1 Compare October 15, 2025 08:53
/// An identifier for a trace, which groups a set of related spans together.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub struct TraceId(pub u128);
/// A globally-unique id identifying a process.

Not a big fan of the name because it only really fits on std?

ExecutionId kinda was the general name for this "an execution of code happening somewhere"

But I guess if it's just this name it shouldn't block the PR

This was supposed to be on ProcessId

Do we even need it or can we just add ThreadId to the InstanceMessage?

Contributor Author

A span can be entered+exited from many threads on the same process (e.g. if it's part of a future that gets stolen within a multi-threaded tokio executor).
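
One way a consumer might handle this is to keep a stack of entered spans per execution (process + thread) and treat the innermost entry as the implicit parent. The sketch below is only an illustration under that assumption; `Tracker`, its methods, and the field widths are hypothetical and are not the veecle-telemetry-ui implementation.

```rust
use std::collections::HashMap;

/// Hypothetical ids; the field widths are assumptions.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct ExecutionId {
    pub process: u128,
    pub thread: u64,
}

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct SpanId(pub u64);

/// Consumer-side state: the spans currently entered on each execution.
#[derive(Default)]
pub struct Tracker {
    stacks: HashMap<ExecutionId, Vec<SpanId>>,
}

impl Tracker {
    /// Records that `span` was entered on `execution`.
    pub fn enter(&mut self, execution: ExecutionId, span: SpanId) {
        self.stacks.entry(execution).or_default().push(span);
    }

    /// Records that `span` was exited on `execution`.
    pub fn exit(&mut self, execution: ExecutionId, span: SpanId) {
        if let Some(stack) = self.stacks.get_mut(&execution) {
            // Remove the most recent matching entry, tolerating unbalanced exits.
            if let Some(pos) = stack.iter().rposition(|s| *s == span) {
                stack.remove(pos);
            }
        }
    }

    /// The span an event emitted right now on `execution` would attach to.
    pub fn implicit_parent(&self, execution: ExecutionId) -> Option<SpanId> {
        self.stacks.get(&execution).and_then(|stack| stack.last().copied())
    }
}
```

Because the state is keyed by execution rather than by span, the same span can be exited on one thread and entered again on another without confusing the parent lookup.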


@ForsakenHarmony ForsakenHarmony Oct 15, 2025

I know, I'd just like to not have both ExecutionId and ProcessId, so I'm thinking we can probably remove ProcessId again and put ThreadId next to ExecutionId where necessary

Contributor Author

I think it's much easier to think about the context-tracking in the UI when you have a name for the "thread of execution", instead of having a HashMap<(ProcessId, ThreadId), _>, and I felt ExecutionId fit that better than it did the process.

One other option would be to not name the integer for the threads, and have

struct ThreadId {
  process: ProcessId,
  thread: u64,
}

to only need to come up with two names.

I guess thinking about it ThreadId also doesn't make sense on no_std targets

I guess nesting ExecutionId inside whatever we want to call ThreadId would always make it globally unique as well which might be nice

Contributor Author

@Nemo157 Nemo157 Oct 15, 2025

Trying to pin it down, the two things we're identifying are:

  • the global memory space
  • the call stack

These don't really make for great names, so personally I think using the standard OS names for these works fine. They're probably instantly understandable to most devs and are easy to map to other systems (e.g. FreeRTOS: process = reset, thread = task).

Do you think keeping these as two separate fields is needed?
Why don't we pack them into a single integer instead, with the high half holding the process id and the low half holding the thread id?
That way it is still a newtype but can be carried anywhere. I doubt a single process can ever have more than 2^32 thread ids.
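
The packing suggestion could look roughly like the sketch below. It assumes both ids fit in 32 bits, which this thread does not establish (elsewhere the process id is described as 128 bits wide); `PackedExecutionId` is a hypothetical name.

```rust
/// Hypothetical packed id: process id in the high 32 bits, thread id in the low 32.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct PackedExecutionId(u64);

impl PackedExecutionId {
    /// Combines the two halves into one integer.
    pub fn new(process: u32, thread: u32) -> Self {
        Self((u64::from(process) << 32) | u64::from(thread))
    }

    /// Extracts the high half.
    pub fn process(self) -> u32 {
        (self.0 >> 32) as u32
    }

    /// Extracts the low half (the cast deliberately drops the high bits).
    pub fn thread(self) -> u32 {
        self.0 as u32
    }
}
```

The trade-off is a single `Copy` integer that is cheap to pass around, at the cost of fixing both widths up front.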

Contributor Author

This has been moved to #90

@Nemo157 Nemo157 force-pushed the wim/push-pxywosxxsluy branch from 85148d1 to 52aadd1 Compare October 22, 2025 12:33
The collector is now initialized with a process id and automatically
combines this with a per-thread id from the OSAL to create a globally
unique id for the current thread/task.

Closes: DEV-913
Signed-off-by: Wim Looman <[email protected]>

…ntification

Remove the `TraceId` concept and use `ProcessId` directly to identify
the context within which `SpanId`s are unique.

Previously, `TraceId` was generated from `ProcessId` using a counter,
but this added complexity without clear benefit and requires thread
local state. `SpanId`s are now unique within a process, and the
combination of `ProcessId` + `SpanId` provides global uniqueness through
`SpanContext`.

Signed-off-by: Wim Looman <[email protected]>

Remove the thread-local `CURRENT_SPAN` tracking and
`SpanContext::current` method. Span context is now determined by
execution id (process + thread) rather than explicit parent-child span
relationships.

This simplifies the telemetry model by eliminating implicit state within
the process. Span messages are correlated through their execution id,
which provides sufficient context for external tools to reconstruct the
span relationships.

Closes: DEV-911
Signed-off-by: Wim Looman <[email protected]>

Add custom `Display`, `FromStr`, `Serialize`, and `Deserialize`
implementations for `ProcessId`, `ThreadId`, `ExecutionId`, and
`SpanContext` types.

These provide a consistent hex-encoded string format with colon
separators for composite IDs (`process:thread` for `ExecutionId`,
`process:span` for `SpanContext`). This makes telemetry ids more
readable and provides a unified format for logging and serialization.

Signed-off-by: Wim Looman <[email protected]>
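
The `process:thread` string format described in the last commit message could be sketched as follows. The digit widths (32 hex digits for the process, 16 for the thread), the field types, and the `ParseIdError` type are assumptions for illustration; the crate's real implementations may differ.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical composite id; field widths are assumptions.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct ExecutionId {
    pub process: u128,
    pub thread: u64,
}

impl fmt::Display for ExecutionId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Fixed-width, zero-padded hex halves joined by a colon.
        write!(f, "{:032x}:{:016x}", self.process, self.thread)
    }
}

/// Hypothetical error type for malformed ids.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseIdError;

impl FromStr for ExecutionId {
    type Err = ParseIdError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split at the first colon, then parse each half as hexadecimal.
        let (process, thread) = s.split_once(':').ok_or(ParseIdError)?;
        Ok(Self {
            process: u128::from_str_radix(process, 16).map_err(|_| ParseIdError)?,
            thread: u64::from_str_radix(thread, 16).map_err(|_| ParseIdError)?,
        })
    }
}
```

Formatting and then parsing an id round-trips to the same value, which is the property a shared logging and serialization format relies on.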
@Nemo157 Nemo157 force-pushed the wim/push-pxywosxxsluy branch from 52aadd1 to d2ab38d Compare October 23, 2025 09:33
@Nemo157 Nemo157 closed this Oct 23, 2025
@Nemo157 Nemo157 deleted the wim/push-pxywosxxsluy branch October 23, 2025 09:42