
Conversation

@BenjaminMichaelis
Contributor

The OpenAI API has deprecated max_tokens in favor of max_completion_tokens for newer models. This change updates both text and image model calls.

Fixes: #1969

API Spec: https://platform.openai.com/docs/api-reference/chat/create
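
As a rough before/after of what the rename means at the API level (illustrative only, not code from this PR; model names, the prompt, and the token cap are placeholders):

```ts
import OpenAI from "openai";

// Illustrative only; model names, the prompt, and the token cap are placeholders.
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const messages = [{ role: "user" as const, content: "Summarize this bookmark." }];

// Older chat models accept the legacy parameter:
await client.chat.completions.create({ model: "gpt-4.1-mini", messages, max_tokens: 512 });

// Newer models (e.g. the gpt-5 family) reject max_tokens and expect:
await client.chat.completions.create({ model: "gpt-5-nano", messages, max_completion_tokens: 512 });
```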

@coderabbitai

coderabbitai bot commented Oct 1, 2025

Walkthrough

Replaced use of max_tokens with a conditional max_completion_tokens in inference calls: when serverConfig.inference.useMaxCompletionTokens is true the code emits max_completion_tokens = serverConfig.inference.maxOutputTokens; otherwise it continues to emit max_tokens = serverConfig.inference.maxOutputTokens. Control flow and response_format handling unchanged.
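
A minimal sketch of that conditional (not the actual packages/shared/inference.ts; the useMaxCompletionTokens and maxOutputTokens fields and the branch come from the walkthrough, while the client setup, model name, and prompt handling are assumptions):

```ts
import OpenAI from "openai";

// Stand-in for serverConfig.inference; the values here are assumed.
const inference = {
  textModel: "gpt-5-nano",       // assumed model name
  maxOutputTokens: 2048,         // assumed default
  useMaxCompletionTokens: true,  // the new flag from this PR
};

const openAI = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function inferFromTextSketch(prompt: string) {
  // The conditional described in the walkthrough: emit one token parameter or the other.
  const tokenParam = inference.useMaxCompletionTokens
    ? { max_completion_tokens: inference.maxOutputTokens }
    : { max_tokens: inference.maxOutputTokens };

  return openAI.chat.completions.create({
    model: inference.textModel,
    messages: [{ role: "user", content: prompt }],
    response_format: { type: "json_object" },
    ...tokenParam,
  });
}
```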

Changes

  • Inference params update (packages/shared/inference.ts): Conditionalized the OpenAI completion parameter: emit max_completion_tokens when serverConfig.inference.useMaxCompletionTokens is true; otherwise emit max_tokens. Uses serverConfig.inference.maxOutputTokens as the value in both branches.
  • Server config schema & env (packages/shared/config.ts): Added environment flag INFERENCE_USE_MAX_COMPLETION_TOKENS (stringBool("false")) and exposed it as useMaxCompletionTokens on serverConfig.inference (see the config sketch after this list).
  • Docs update (docs/docs/03-configuration.md): Added an INFERENCE_USE_MAX_COMPLETION_TOKENS entry under Inference Configs with default false and a description.
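
The config.ts item above amounts to roughly the following (a sketch: stringBool("false"), the INFERENCE_USE_MAX_COMPLETION_TOKENS variable, and the useMaxCompletionTokens field come from this PR; the maxOutputTokens variable name, its default, and the surrounding structure are assumptions):

```ts
// Sketch of the flag's definition and mapping; not the project's actual config module.
function stringBool(defaultValue: "true" | "false") {
  return (raw: string | undefined) => (raw ?? defaultValue) === "true";
}

const serverConfig = {
  inference: {
    // Assumed variable name and default for the existing output-token limit.
    maxOutputTokens: Number(process.env.INFERENCE_MAX_OUTPUT_TOKENS ?? "2048"),
    // New flag; defaults to false so existing deployments keep sending max_tokens.
    useMaxCompletionTokens: stringBool("false")(
      process.env.INFERENCE_USE_MAX_COMPLETION_TOKENS,
    ),
  },
};
```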

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Inference as inference.ts
    participant OpenAI

    Client->>Inference: request inferFromText / inferFromImage
    Note right of Inference #DDEBF7: Read serverConfig.inference\n(useMaxCompletionTokens, maxOutputTokens)
    alt useMaxCompletionTokens == true
        Inference->>OpenAI: send completion with\n`max_completion_tokens = maxOutputTokens`
    else useMaxCompletionTokens == false
        Inference->>OpenAI: send completion with\n`max_tokens = maxOutputTokens`
    end
    OpenAI-->>Inference: completion response
    Inference-->>Client: return response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks

✅ Passed checks (5 passed)
  • Title Check ✅ Passed: The PR title "fix: update OpenAI API to use max_completion_tokens instead of max_tokens" is related to the changeset and addresses a key functional aspect of the changes. The code does update the OpenAI parameter handling to support max_completion_tokens, though the actual implementation adds a configurable flag that conditionally switches between the two parameters rather than a complete replacement. The title captures the primary functional change even if it doesn't fully convey the conditional, backward-compatible nature of the implementation.
  • Linked Issues Check ✅ Passed: The PR implements the requirements from issue #1969 and incorporates the backward-compatibility approach suggested in the discussion comments. The changes address the core objectives: enabling support for max_completion_tokens to fix the gpt-5-nano compatibility error (#1969) while maintaining backward compatibility through the new INFERENCE_USE_MAX_COMPLETION_TOKENS config flag. Users can enable this flag when using models that require max_completion_tokens, and the default behavior remains unchanged for existing deployments. The implementation updates both text and image inference calls as required.
  • Out of Scope Changes Check ✅ Passed: All changes in the PR are directly related to the stated objectives and remain within scope. The three modified files (inference.ts, config.ts, and configuration.md) collectively implement the configurable parameter switching logic, expose the necessary environment variable, and document the new configuration option. No unrelated changes, refactoring, or side modifications are present. Each change serves a clear purpose in achieving backward-compatible support for max_completion_tokens.
  • Description Check ✅ Passed: The PR description clearly explains the context for the change ("OpenAI API has deprecated max_tokens in favor of max_completion_tokens for newer models") and describes what is being updated ("both text and image model calls"). It references the linked issue (#1969) and provides the OpenAI API documentation reference. The description is directly related to the changeset and provides sufficient context for understanding the motivation behind the changes.
  • Docstring Coverage ✅ Passed: No functions found in the changes; docstring coverage check skipped.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b92ed2a and 83514a4.

📒 Files selected for processing (3)
  • docs/docs/03-configuration.md (1 hunks)
  • packages/shared/config.ts (2 hunks)
  • packages/shared/inference.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/shared/inference.ts
🧰 Additional context used
📓 Path-based instructions (3)
docs/docs/03-configuration.md

📄 CodeRabbit inference engine (AGENTS.md)

Project configuration options are documented in docs/docs/03-configuration.md

Files:

  • docs/docs/03-configuration.md
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use oxlint as the linter for JavaScript/TypeScript code

Files:

  • packages/shared/config.ts
packages/shared/**

📄 CodeRabbit inference engine (AGENTS.md)

Shared code and types reside in packages/shared

Files:

  • packages/shared/config.ts
🔇 Additional comments (3)
packages/shared/config.ts (2)

68-68: LGTM! Environment variable properly configured.

The new flag is correctly defined with backward-compatible default behavior (false = use legacy max_tokens parameter). The placement among other inference config variables is appropriate.


229-229: LGTM! Config mapping is correct.

The environment variable is properly mapped to the inference config object with appropriate camelCase naming convention.

docs/docs/03-configuration.md (1)

98-98: Implementation verified and complete.

All verification checkpoints pass:

  • Flag is properly checked in both text and image inference calls (lines 93 and 132)
  • When true: emits max_completion_tokens with maxOutputTokens value (lines 94, 133)
  • When false: emits max_tokens for backward compatibility (lines 95, 134)
  • Both OpenAI chat completion calls handle the conditional correctly

The documentation accurately reflects the implementation.
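
For operators, opting in is then just a matter of setting the flag in the environment, for example (only the new flag comes from this PR; the model variable shown alongside it is an assumption about the existing config):

```ini
# Illustrative .env snippet
INFERENCE_TEXT_MODEL=gpt-5-nano            # assumed existing setting
INFERENCE_USE_MAX_COMPLETION_TOKENS=true   # new flag; defaults to false
```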




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e331a7 and b92ed2a.

📒 Files selected for processing (1)
  • packages/shared/inference.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use oxlint as the linter for JavaScript/TypeScript code

Files:

  • packages/shared/inference.ts
packages/shared/**

📄 CodeRabbit inference engine (AGENTS.md)

Shared code and types reside in packages/shared

Files:

  • packages/shared/inference.ts
🔇 Additional comments (1)
packages/shared/inference.ts (1)

130-130: Consistent parameter update for image inference.

The change mirrors the update in inferFromText (line 93), ensuring consistency across both text and image inference methods.
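
For illustration, the mirrored image path could look like the sketch below, reusing the openAI client and inference object from the text sketch earlier on this page (the data-URL message shape and model choice are assumptions; only the token-parameter conditional comes from this PR):

```ts
// Sketch only: mirrors inferFromTextSketch above, but with image content.
async function inferFromImageSketch(
  prompt: string,
  contentType: string,  // e.g. "image/jpeg" (assumed)
  encodedImage: string, // base64-encoded image bytes (assumed)
) {
  const tokenParam = inference.useMaxCompletionTokens
    ? { max_completion_tokens: inference.maxOutputTokens }
    : { max_tokens: inference.maxOutputTokens };

  return openAI.chat.completions.create({
    model: inference.textModel, // the real code likely uses a separate image-capable model
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          {
            type: "image_url",
            image_url: { url: `data:${contentType};base64,${encodedImage}` },
          },
        ],
      },
    ],
    response_format: { type: "json_object" },
    ...tokenParam,
  });
}
```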

@Theelx

Theelx commented Oct 4, 2025

Can this be looked at and hopefully merged? It would be really nice to have this in so I can use gpt-5-mini for summarization instead of being stuck on gpt-4.1-mini, especially since gpt-5-mini is only a bit over half the price for input tokens.

@NeodymiumPhish

Is there anything blocking this fix? I updated my instance to use gpt-5-nano after adding more API credits, and I'd prefer to leave it on gpt-5-nano, but I really like the AI summarization and tagging features.

@MohamedBassem
Collaborator

Hey folks, sorry for taking so long to get back to this. The biggest concern with this PR is what it'll break: people use the OpenAI client for almost all other providers, and I don't know which providers wouldn't support this. So here's my suggestion: can we add a new env variable that controls which field to pass, default it to the old behavior for now, and then switch the default in the next release after giving people a heads-up in the release notes?

@NeodymiumPhish

This makes sense to me. It might be best to just leave a legacy_max_tokens flag for the foreseeable future, since a lot of other projects surely aren't going to rush to update just to match OpenAI on this.

@BenjaminMichaelis
Contributor Author

@MohamedBassem Took a crack at an update. I decided to go with a variable that controls which field to pass rather than having two token fields, since that seemed clearer and carries less risk of someone setting both and then having to explain what happens in that case.

MohamedBassem merged commit 046c29d into karakeep-app:main on Oct 25, 2025.
5 of 6 checks passed
@MohamedBassem
Collaborator

Thanks a lot @BenjaminMichaelis, and sorry once more for the delay :) This will be in the nightly image in around 30mins.

@NeodymiumPhish

When should we look for this in the regular release? I've added the INFERENCE_USE_MAX_COMPLETION_TOKENS env variable and set it to true, but I still get "Something went wrong" when attempting to Summarize a bookmark. I re-pulled images after making the change, so I'm assuming it's only on the nightly builds and that's why I'm still getting an error.

@Eragos
Contributor

Eragos commented Nov 4, 2025

Should be in the next release

BenjaminMichaelis deleted the fix/openai-max-tokens-parameter branch on November 4, 2025 at 17:28.
Successfully merging this pull request may close these issues:

  • max_tokens has been deprecated by OpenAI (#1969)