[WIP] Support tensorzero_reasoning_content for OpenAI-compatible endpoint #5992
base: main
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9e02ab4728
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```rust
ContentBlockChunk::Thought(thought) => {
    let len = thought_id_to_index.len();
    let index = *thought_id_to_index.entry(thought.id.clone()).or_insert(len);
    thoughts.push(OpenAICompatibleThoughtChunk {
        index,
        // …
```
Track thought indices in streaming by full content order
In `process_chat_content_chunk`, the thought index is derived solely from the count of distinct thought IDs, but the struct comment and the `openai_messages_to_input` semantics treat this index as the position in the full content array (text and tool calls included). Any stream where a text or tool-call block appears before the first thought will therefore produce `index = 0` even though the actual content position should be greater than zero, so clients that round-trip `tensorzero_reasoning_content` will mis-order thoughts relative to text and tool calls (e.g., a text chunk followed by a thought chunk will be reconstructed as thought-first). Consider tracking indices based on overall content block order (as the non-streaming path does) rather than only thought ID order.
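One possible shape for the fix, sketched below with simplified stand-in types rather than TensorZero's actual chunk definitions, is to assign an index to every content block the first time its ID appears, so thought indices follow overall content order:

```rust
use std::collections::HashMap;

// Simplified stand-ins for the gateway's streaming chunk types; the real
// definitions differ. Block IDs are assumed unique across block kinds here.
struct TextChunk { id: String }
struct ToolCallChunk { id: String }
struct ThoughtChunk { id: String, text: String }

enum ContentBlockChunk {
    Text(TextChunk),
    ToolCall(ToolCallChunk),
    Thought(ThoughtChunk),
}

struct ThoughtDelta {
    // Position of the thought's content block in the overall content array.
    index: usize,
    text: String,
}

/// Gives every content block (text, tool call, or thought) an index the first
/// time its ID appears, so a thought streamed after a text block gets index 1
/// rather than 0, matching the non-streaming path's content ordering.
fn collect_thought_deltas(chunks: &[ContentBlockChunk]) -> Vec<ThoughtDelta> {
    let mut block_id_to_index: HashMap<String, usize> = HashMap::new();
    let mut thoughts = Vec::new();

    for chunk in chunks {
        let id = match chunk {
            ContentBlockChunk::Text(c) => &c.id,
            ContentBlockChunk::ToolCall(c) => &c.id,
            ContentBlockChunk::Thought(c) => &c.id,
        };
        // First time this block ID is seen, it claims the next overall index.
        let len = block_id_to_index.len();
        let index = *block_id_to_index.entry(id.clone()).or_insert(len);

        if let ContentBlockChunk::Thought(thought) = chunk {
            thoughts.push(ThoughtDelta {
                index,
                text: thought.text.clone(),
            });
        }
    }
    thoughts
}
```

Keying the map by block ID rather than counting chunks keeps the index stable when several chunks for the same block arrive, under the stated assumption that IDs are unique across block kinds.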
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
```rust
#[expect(clippy::ref_option)]
fn is_none_or_empty<T>(v: &Option<Vec<T>>) -> bool {
    v.as_ref().is_none_or(Vec::is_empty)
}
```
Duplicated is_none_or_empty helper function across modules
Low Severity
The `is_none_or_empty` function is being added to `chat_completions.rs`, but an identical function already exists in `streaming.rs`. Both have the same signature, the same implementation (`v.as_ref().is_none_or(Vec::is_empty)`), and even the same `#[expect(clippy::ref_option)]` attribute. This duplication creates a maintenance burden: any future bug fix would need to be applied in both locations.
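One way to resolve the duplication is sketched below; the module placement and the response-struct shape are hypothetical, not the PR's actual layout. The idea is to keep a single definition somewhere both files can import it and point the `skip_serializing_if` attributes at that one helper:

```rust
use serde::Serialize;

// A single definition, placed where both chat_completions.rs and streaming.rs
// can import it (the exact module path is a placeholder for this sketch).
#[expect(clippy::ref_option)]
pub fn is_none_or_empty<T>(v: &Option<Vec<T>>) -> bool {
    v.as_ref().is_none_or(Vec::is_empty)
}

// Illustrative response shape: both optional list fields are omitted from the
// JSON entirely when they are None or empty, instead of being emitted as `[]`.
#[derive(Serialize)]
pub struct AssistantMessage {
    pub content: Option<String>,
    #[serde(skip_serializing_if = "is_none_or_empty")]
    pub tool_calls: Option<Vec<ToolCall>>,
    #[serde(skip_serializing_if = "is_none_or_empty")]
    pub tensorzero_reasoning_content: Option<Vec<Thought>>,
}

#[derive(Serialize)]
pub struct ToolCall {
    pub id: String,
}

#[derive(Serialize)]
pub struct Thought {
    pub text: String,
}
```

With one shared helper, a future change to the emptiness check happens in a single place and both the streaming and non-streaming serializers stay in sync.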


Note
Medium Risk
Expands the OpenAI-compatible request/response schema and changes how assistant message content is assembled/serialized (including index-based insertion), which could affect downstream clients relying on prior omission/ordering.
Overview
Adds optional `tensorzero_reasoning_content` to OpenAI-compatible assistant messages and surfaces internal `Thought`/`ThoughtChunk` blocks in both non-streaming and streaming responses (with stable per-content indices). Updates conversion logic to insert incoming thoughts into assistant content using explicit indices (and prepends unindexed thoughts), and adjusts serialization to omit `tool_calls`/`tensorzero_reasoning_content` when empty rather than emitting empty arrays; includes new serde-flattened thought types and tests for ordering/serialization.

Written by Cursor Bugbot for commit 9e02ab4. This will update automatically on new commits.
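The index-based insertion described in the overview could look roughly like the following sketch, which uses placeholder types and a hypothetical `merge_thoughts` helper rather than the PR's actual conversion code:

```rust
// Simplified placeholder types for illustration; the gateway's actual content
// and thought types differ.
#[derive(Debug)]
enum ContentBlock {
    Text(String),
    Thought(String),
}

struct IncomingThought {
    text: String,
    // Position in the full content array, if the client round-tripped the
    // index the gateway emitted; None for thoughts sent without an index.
    index: Option<usize>,
}

/// Sketch of the merge described in the overview: thoughts with an explicit
/// index are inserted at that position in the assistant content, and thoughts
/// without an index are prepended.
fn merge_thoughts(
    mut content: Vec<ContentBlock>,
    thoughts: Vec<IncomingThought>,
) -> Vec<ContentBlock> {
    let (unindexed, mut indexed): (Vec<IncomingThought>, Vec<IncomingThought>) =
        thoughts.into_iter().partition(|t| t.index.is_none());

    // Insert indexed thoughts in ascending order so each insertion shifts the
    // remaining targets into place; indices past the end are clamped.
    indexed.sort_by_key(|t| t.index.unwrap_or(0));
    for thought in indexed {
        let at = thought.index.unwrap_or(0).min(content.len());
        content.insert(at, ContentBlock::Thought(thought.text));
    }

    // Prepend unindexed thoughts, reversing so their original order is kept.
    for thought in unindexed.into_iter().rev() {
        content.insert(0, ContentBlock::Thought(thought.text));
    }
    content
}

fn example() -> Vec<ContentBlock> {
    merge_thoughts(
        vec![ContentBlock::Text("final answer".to_string())],
        vec![IncomingThought { text: "step 1".to_string(), index: Some(0) }],
    )
    // Result: [Thought("step 1"), Text("final answer")]
}
```

Inserting indexed thoughts in ascending order keeps later positions correct as earlier thoughts are added, while unindexed thoughts simply end up ahead of the rest of the content, consistent with the "prepends unindexed thoughts" behavior the summary describes.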