fix: use max_completion_tokens for GPT-5 models in LiteLLM provider#6980
Conversation
- GPT-5 models require `max_completion_tokens` instead of the deprecated `max_tokens` parameter
- Added detection for GPT-5 model variants (`gpt-5`, `gpt5`, `GPT-5`, etc.)
- Updated both `createMessage` and `completePrompt` methods to handle GPT-5 models
- Added comprehensive tests for GPT-5 model handling

Fixes #6979
src/api/providers/lite-llm.ts
Outdated
```ts
let maxTokens: number | undefined = info.maxTokens ?? undefined

// Check if this is a GPT-5 model that requires max_completion_tokens instead of max_tokens
const isGPT5Model = modelId.toLowerCase().includes("gpt-5") || modelId.toLowerCase().includes("gpt5")
```
Consider extracting the GPT-5 model detection logic into a shared helper function. This logic (using modelId.toLowerCase().includes('gpt-5') || modelId.toLowerCase().includes('gpt5')) appears in both createMessage and completePrompt, and centralizing it would improve maintainability.
This comment was generated because it violated a code review rule: irule_tTqpIuNs8DV0QFGj.
src/api/providers/lite-llm.ts
Outdated
```ts
let maxTokens: number | undefined = info.maxTokens ?? undefined

// Check if this is a GPT-5 model that requires max_completion_tokens instead of max_tokens
const isGPT5Model = modelId.toLowerCase().includes("gpt-5") || modelId.toLowerCase().includes("gpt5")
```
The model detection logic could be more precise. Currently, modelId.toLowerCase().includes("gpt-5") would match unintended models like "not-gpt-5000". Consider using a more specific pattern:
Suggested change:

```diff
- const isGPT5Model = modelId.toLowerCase().includes("gpt-5") || modelId.toLowerCase().includes("gpt5")
+ // Check if this is a GPT-5 model that requires max_completion_tokens instead of max_tokens
+ const modelLower = modelId.toLowerCase()
+ const isGPT5Model = modelLower.startsWith("gpt-5") || modelLower.startsWith("gpt5") || modelLower === "gpt5"
```
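For illustration, a quick sketch of how the two checks differ on a few hypothetical model IDs (the IDs below are made up, not from this PR):

```ts
// Hypothetical model IDs used only to illustrate the matching behavior
const ids = ["gpt-5", "gpt-5-turbo", "not-gpt-5000", "my-gpt5-proxy"]

// Broad check: matches the substring anywhere in the ID
const broad = (id: string) => id.toLowerCase().includes("gpt-5") || id.toLowerCase().includes("gpt5")

// Narrow check: the ID must start with the GPT-5 prefix
const narrow = (id: string) => {
	const lower = id.toLowerCase()
	return lower.startsWith("gpt-5") || lower.startsWith("gpt5")
}

for (const id of ids) {
	console.log(id, { broad: broad(id), narrow: narrow(id) })
}
// "not-gpt-5000" and "my-gpt5-proxy" pass the broad check but fail the narrow one
```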
src/api/providers/lite-llm.ts
Outdated
```ts
const { id: modelId, info } = await this.fetchModel()

// Check if this is a GPT-5 model that requires max_completion_tokens instead of max_tokens
const isGPT5Model = modelId.toLowerCase().includes("gpt-5") || modelId.toLowerCase().includes("gpt5")
```
This detection logic is duplicated from line 111. Would it be cleaner to extract this into a helper method to maintain DRY principles? Something like:
```ts
private isGPT5Model(modelId: string): boolean {
	const modelLower = modelId.toLowerCase()
	return modelLower.startsWith("gpt-5") || modelLower.startsWith("gpt5") || modelLower === "gpt5"
}
```
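Both `createMessage` and `completePrompt` could then call `this.isGPT5Model(modelId)`, keeping the detection rule in one place.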
src/api/providers/lite-llm.ts
Outdated
```ts
// GPT-5 models require max_completion_tokens instead of the deprecated max_tokens parameter
if (isGPT5Model && maxTokens) {
	// @ts-ignore - max_completion_tokens is not in the OpenAI types yet but is supported
```
Is there a way to avoid using @ts-ignore here? Could we extend the OpenAI types or create a custom interface that includes max_completion_tokens to maintain type safety? For example:
```ts
interface GPT5RequestOptions extends Omit<OpenAI.Chat.Completions.ChatCompletionCreateParamsStreaming, 'max_tokens'> {
	max_completion_tokens?: number
	max_tokens?: never
}
```
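A possible usage sketch with that interface (hypothetical; `openAiMessages` and `maxTokens` stand in for the handler's local variables):

```ts
// Sketch: typing the request with the extended interface avoids @ts-ignore
const requestOptions: GPT5RequestOptions = {
	model: modelId,
	messages: openAiMessages,
	stream: true,
	max_completion_tokens: maxTokens,
}
```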
src/api/providers/__tests__/lite-llm.spec.ts

```ts
})

it("should use max_completion_tokens for various GPT-5 model variations", async () => {
	const gpt5Variations = ["gpt-5", "gpt5", "GPT-5", "gpt-5-turbo", "gpt5-preview"]
```
Great test coverage! Consider adding edge cases like mixed case variations ("GpT-5", "gPt5") or models with additional suffixes ("gpt-5-32k", "gpt-5-vision") to ensure the detection works correctly for all possible GPT-5 model names.
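For instance (a sketch, not the PR's actual test data):

```ts
const gpt5Variations = [
	"gpt-5", "gpt5", "GPT-5", "gpt-5-turbo", "gpt5-preview",
	"GpT-5", "gPt5", // mixed case
	"gpt-5-32k", "gpt-5-vision", // additional suffixes
]
```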
This can be merged after #7067 since this PR will update the OpenAI SDK, allowing us to use it without the type error suppression comments.
Pull Request Overview
This PR fixes an issue where GPT-5 models fail with LiteLLM due to using the deprecated max_tokens parameter instead of max_completion_tokens. The fix adds detection for GPT-5 model variants and conditionally uses the correct parameter based on the model type.
- Adds GPT-5 model detection logic to identify models requiring `max_completion_tokens`
- Updates both `createMessage` and `completePrompt` methods to use the appropriate parameter
- Maintains backward compatibility for non-GPT-5 models
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/api/providers/lite-llm.ts` | Implements GPT-5 model detection and conditional parameter usage |
| `src/api/providers/__tests__/lite-llm.spec.ts` | Adds comprehensive test coverage for GPT-5 handling and refactors mocking |
src/api/providers/lite-llm.ts
Outdated
```ts
let maxTokens: number | undefined = info.maxTokens ?? undefined

// Check if this is a GPT-5 model that requires max_completion_tokens instead of max_tokens
const isGPT5Model = modelId.toLowerCase().includes("gpt-5") || modelId.toLowerCase().includes("gpt5")
```
The GPT-5 model detection logic is duplicated between the createMessage and completePrompt methods. Consider extracting this into a private helper method to improve maintainability and ensure consistency.
src/api/providers/lite-llm.ts
Outdated
```ts
// @ts-ignore - max_completion_tokens is not in the OpenAI types yet but is supported
requestOptions.max_completion_tokens = maxTokens
```
Using @ts-ignore to suppress TypeScript errors for max_completion_tokens is not ideal. Consider using type assertion with a more specific interface or extending the OpenAI types to include this property for better type safety.
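One narrow alternative, as a sketch (the real shape of `requestOptions` depends on the handler):

```ts
// Assert only the property being added instead of suppressing the checker
type WithMaxCompletionTokens = { max_completion_tokens?: number }
;(requestOptions as WithMaxCompletionTokens).max_completion_tokens = maxTokens
```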
src/api/providers/lite-llm.ts
Outdated
```ts
// @ts-ignore - max_completion_tokens is not in the OpenAI types yet but is supported
requestOptions.max_completion_tokens = info.maxTokens
```
This is a duplicate of the previous @ts-ignore comment. The same type safety concerns apply here - consider using a consistent approach to handle the missing type definition.
This PR fixes the issue where GPT-5 models fail with LiteLLM due to using the deprecated `max_tokens` parameter instead of `max_completion_tokens`.

Problem
When using GPT-5 models with LiteLLM, users encounter the error:
Solution
- Updated `createMessage` and `completePrompt` methods to use `max_completion_tokens` for GPT-5 models (see the sketch below)
- Maintains backward compatibility with `max_tokens` for non-GPT-5 models
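In outline, the conditional selection looks like this (a sketch distilled from the diff excerpts above, not the exact patch):

```ts
// Pick the token-limit parameter the target model accepts
if (isGPT5Model && maxTokens) {
	// GPT-5 rejects max_tokens; send max_completion_tokens instead
	requestOptions.max_completion_tokens = maxTokens
} else if (maxTokens) {
	// All other models keep the existing max_tokens behavior
	requestOptions.max_tokens = maxTokens
}
```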
Testing

- Non-GPT-5 models continue to use `max_tokens` as before

Fixes #6979
Important
Fixes GPT-5 model issue in LiteLLM by using `max_completion_tokens` instead of `max_tokens`, with tests for various model names.

- Fixes GPT-5 model handling in `LiteLLMHandler` by using `max_completion_tokens` instead of deprecated `max_tokens`.
- Updates `createMessage` and `completePrompt` methods in `lite-llm.ts`; non-GPT-5 models continue to use `max_tokens`.
- Adds tests in `lite-llm.spec.ts` for GPT-5 model handling, including that non-GPT-5 models still use `max_tokens`.
- Adds `isGpt5()` function to detect GPT-5 model variants in `lite-llm.ts`.

This description was created automatically for 822886b and will update as commits are pushed.