Refactor chat model health check to lower tokens usage for reasoning models #7317
Conversation
✅ Pull with Spice check passed.
Pull Request Overview
This PR optimizes the health check process for OpenAI chat models to reduce token usage for reasoning models. The changes specifically target models that may trigger reasoning during health checks (GPT-5, O3, O4), which could cause excessive token consumption and model loading failures.
- Added reasoning effort control for reasoning-capable models
- Increased max_tokens limit from 150 to 300 tokens
- Enhanced health check logging and response handling
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| crates/llms/src/openai/mod.rs | Added supports_reasoning_effort() method to identify models that support reasoning effort parameter |
| crates/llms/src/openai/chat.rs | Updated health check to use low reasoning effort for supported models and increased token limits |
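The body of the new `supports_reasoning_effort()` method is not shown on this page; a minimal sketch of the prefix matching it likely performs, assuming the `gpt-5*`, `o3*`, and `o4*` patterns named in the PR summary:

```rust
/// Sketch of a prefix check like the supports_reasoning_effort() method
/// this PR adds. The function body is an assumption; only the model
/// families (gpt-5*, o3*, o4*) come from the PR description.
fn supports_reasoning_effort(model_id: &str) -> bool {
    ["gpt-5", "o3", "o4"]
        .iter()
        .any(|prefix| model_id.starts_with(prefix))
}

fn main() {
    // Reasoning-capable families match by prefix.
    assert!(supports_reasoning_effort("gpt-5-mini"));
    assert!(supports_reasoning_effort("o3-mini"));
    // Non-reasoning models do not.
    assert!(!supports_reasoning_effort("gpt-4o"));
    println!("ok");
}
```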
…models (#7317)

* Refactor chat model health check to lower tokens usage for reasoning models
* Fix comment
* Update crates/llms/src/openai/chat.rs
* Update crates/llms/src/openai/chat.rs

Co-authored-by: Luke Kim <[email protected]>
Co-authored-by: Copilot <[email protected]>
📝 Summary
Problem: Some OpenAI models (GPT-5) may trigger reasoning during the spicepod model loader health check. This can increase total token usage until it reaches the specified `max_tokens` limit, causing the model to fail to load and preventing the spicepod from becoming ready.

Chat health checks were adjusted to apply `reasoning_effort: "low"` for the `gpt-5*`, `o3*`, and `o4*` models, and the `max_tokens` parameter was increased to 300.

Before:
After:
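Assuming the OpenAI chat-completions wire format, the adjusted health-check request can be sketched as below; the helper names here are hypothetical illustrations, not the actual code in `crates/llms/src/openai/chat.rs`:

```rust
/// Hypothetical helper mirroring the prefix check described in this PR.
fn supports_reasoning_effort(model_id: &str) -> bool {
    ["gpt-5", "o3", "o4"].iter().any(|p| model_id.starts_with(p))
}

/// Sketch of the JSON body the adjusted health check would send:
/// reasoning_effort "low" only for reasoning-capable models, plus the
/// raised max_tokens of 300 for every model.
fn health_check_body(model_id: &str) -> String {
    let reasoning = if supports_reasoning_effort(model_id) {
        r#""reasoning_effort":"low","#
    } else {
        ""
    };
    format!(
        r#"{{"model":"{model_id}",{reasoning}"max_tokens":300,"messages":[{{"role":"user","content":"ping"}}]}}"#
    )
}

fn main() {
    // gpt-5 gets the low reasoning effort; gpt-4o does not.
    assert!(health_check_body("gpt-5").contains(r#""reasoning_effort":"low""#));
    assert!(!health_check_body("gpt-4o").contains("reasoning_effort"));
    println!("{}", health_check_body("gpt-5"));
}
```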
🔗 Related
🚨 Breaking Changes
📚 Docs
👀 Notes for Reviewers