fix(deepseek): preserve reasoning_content for V4 thinking-mode multi-turn #165

Merged
Kuberwastaken merged 1 commit into Kuberwastaken:main from jhult:fix/deepseek-reasoning-content
May 23, 2026

Conversation

@jhult jhult commented May 22, 2026

Contributor

Summary

Fixes DeepSeek V4 multi-turn tool-call failures by properly preserving reasoning_content through the streaming pipeline. DeepSeek V4's thinking mode requires reasoning_content to be echoed back on subsequent turns with tool calls, or the API returns a 400 error.

Related

Based on PR #111, which identified the issue and provided the fix approach.

Changes

  1. openai_compat.rs stream loop: Opens a dedicated Thinking block (using a reserved index) on the first reasoning delta and closes it before any text, tool_calls, or finish_reason. This ensures reasoning deltas are accumulated instead of being silently dropped.

  2. Token optimization: Only includes reasoning_content for providers that require it (currently DeepSeek V4). Adds a requires_reasoning_roundtrip quirk flag to gate inclusion, which avoids wasting tokens on providers such as OpenAI, Groq, and Azure that ignore the field.

  3. Provider registration: DeepSeek provider sets requires_reasoning_roundtrip: true.
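The Thinking-block lifecycle described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code in openai_compat.rs: the `Delta` and `Event` enums and the `process_stream` function are hypothetical names invented for this example, and only the reserved index value (`usize::MAX - 100`) comes from the PR.

```rust
/// Reserved index for the thinking block. The PR reports using
/// `usize::MAX - 100` so it cannot collide with text or tool-call indices.
const THINKING_BLOCK_INDEX: usize = usize::MAX - 100;

/// Hypothetical, simplified view of one streamed chunk (not the real type).
#[derive(Debug, Clone, PartialEq)]
enum Delta {
    Reasoning(String),
    Text(String),
    ToolCall(String),
    Finish,
}

/// Hypothetical events handed to the consumer of the stream.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    BlockStart { index: usize },
    ThinkingDelta { index: usize, text: String },
    BlockStop { index: usize },
    Other(Delta),
}

/// Opens the thinking block on the first reasoning delta, accumulates the
/// reasoning text, and closes the block before any other kind of delta.
fn process_stream(deltas: Vec<Delta>) -> (Vec<Event>, String) {
    let mut events = Vec::new();
    let mut reasoning = String::new();
    let mut thinking_open = false;

    for delta in deltas {
        match delta {
            Delta::Reasoning(text) => {
                if !thinking_open {
                    events.push(Event::BlockStart { index: THINKING_BLOCK_INDEX });
                    thinking_open = true;
                }
                // Accumulate so the reasoning can be echoed back later.
                reasoning.push_str(&text);
                events.push(Event::ThinkingDelta { index: THINKING_BLOCK_INDEX, text });
            }
            other => {
                // Text, tool calls, and finish all end the thinking block.
                if thinking_open {
                    events.push(Event::BlockStop { index: THINKING_BLOCK_INDEX });
                    thinking_open = false;
                }
                events.push(Event::Other(other));
            }
        }
    }
    (events, reasoning)
}
```

The returned `String` stands in for whatever storage the real implementation uses to keep reasoning available for the next request.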

Testing

  • ✅ Compiles without errors
  • ✅ Block index strategy verified (uses usize::MAX - 100 to avoid collisions)
  • ✅ Thinking lifecycle management correct (open on first delta, close before other content types)
  • ✅ Backward compatible (field only included when needed)

Notes

Research confirms this requirement is unique to DeepSeek V4; no other major LLM provider has multi-turn reasoning round-trip requirements.

…turn

DeepSeek V4 models enable thinking mode by default and stream chain-of-thought as `reasoning_content`. Per the API contract, any assistant turn that produced a tool call must have its `reasoning_content` echoed back on subsequent turns, otherwise the server returns a 400 error. The OpenAI-compatible provider path was not preserving reasoning in multi-turn tool flows, causing all interactions with V4 to fail.

## Changes

1. **openai_compat.rs stream loop**: Now opens a dedicated Thinking block (index usize::MAX - 100) on first reasoning delta and closes it before text/tool_calls/finish_reason. This ensures reasoning deltas are properly accumulated instead of being silently dropped.

2. **openai_compat.rs message building**: Only includes reasoning_content for providers that require it (currently DeepSeek V4). Added `requires_reasoning_roundtrip` quirk flag to gate this inclusion. Prevents wasting tokens on providers that ignore the field (OpenAI, Groq, Azure, etc.).

3. **Provider registration**: DeepSeek provider explicitly sets `requires_reasoning_roundtrip: true`.
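As a rough illustration of the gating in items 2 and 3, the sketch below includes `reasoning_content` only when the provider's quirk flag is set. Only the `requires_reasoning_roundtrip` flag name is taken from this PR; the `ProviderQuirks` and `AssistantTurn` types and the `build_assistant_fields` helper are hypothetical, and a real implementation would presumably build a JSON message rather than a list of key/value pairs.

```rust
/// Hypothetical per-provider quirk flags. Only the field name
/// `requires_reasoning_roundtrip` comes from the PR.
#[derive(Debug, Clone, Copy, Default)]
struct ProviderQuirks {
    requires_reasoning_roundtrip: bool,
}

/// Hypothetical stored assistant turn from an earlier response.
struct AssistantTurn {
    content: String,
    reasoning: Option<String>,
}

/// Builds the fields of an outgoing assistant message as (key, value) pairs.
/// `reasoning_content` is added only for providers that require the round trip.
fn build_assistant_fields(
    turn: &AssistantTurn,
    quirks: ProviderQuirks,
) -> Vec<(&'static str, String)> {
    let mut fields = vec![
        ("role", "assistant".to_string()),
        ("content", turn.content.clone()),
    ];
    if quirks.requires_reasoning_roundtrip {
        if let Some(reasoning) = &turn.reasoning {
            fields.push(("reasoning_content", reasoning.clone()));
        }
    }
    fields
}
```

With the flag defaulting to `false`, providers that ignore the field never receive it, which matches the backward-compatibility claim in the PR description.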

This implementation is based on PR Kuberwastaken#111, which identified and fixed the issue. Research shows this requirement is unique to DeepSeek V4; no other major LLM provider has this multi-turn reasoning round-trip requirement.
@jhult

jhult commented May 22, 2026

Contributor Author

Fixes #121

@Kuberwastaken

Owner

LGTM

@Kuberwastaken Kuberwastaken merged commit 653c905 into Kuberwastaken:main May 23, 2026
@jhult jhult deleted the fix/deepseek-reasoning-content branch May 25, 2026 02:06