Enforce prompt window and trim transcripts deterministically #90
Conversation
Add TrimMessagesToFit and PromptTokenBudget to deterministically fit chat transcripts within the model context, pinning system/developer, dropping oldest non‑pinned, and truncating pinned content when necessary. Wire trimming into pre‑stage and main agent loop before requests, and add unit tests. Fixes #75.
Pull Request Overview
This PR implements deterministic prompt trimming to prevent token limit exceeded errors by adding a systematic approach to fit conversation transcripts within model context windows while preserving critical messages.
- Introduces `TrimMessagesToFit`, a function with a clear policy: pin system/developer messages, drop the oldest non-pinned messages first, and truncate content proportionally when needed
- Adds `PromptTokenBudget` to calculate a safe prompt limit that accounts for completion token reservations
- Enforces trimming in both the main agent and pre-stage request paths to prevent API errors
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/oai/trim.go | Core trimming implementation with the deterministic policy for message preservation and content truncation |
| internal/oai/trim_test.go | Comprehensive unit tests covering trimming behavior, edge cases, and policy validation |
| internal/oai/context_window.go | Utility function to calculate the prompt token budget with safety margins |
| cmd/agentcli/run_agent.go | Integration of the trimming logic in the main agent request path |
| cmd/agentcli/prestage.go | Integration of the trimming logic in the pre-stage request path |
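The budget calculation in `internal/oai/context_window.go` might look like the following sketch. The function name casing, the safety margin value, and the signature are assumptions; the idea from the PR description is that the prompt may use whatever remains of the context window after reserving room for the completion.

```go
package main

import "fmt"

// promptTokenBudget sketches the calculation described for
// internal/oai/context_window.go: prompt budget = context window minus
// the tokens reserved for the completion, minus a small safety margin
// for tokenizer estimation error. The margin value is an assumption.
func promptTokenBudget(contextWindow, maxCompletionTokens int) int {
	const safetyMargin = 256 // assumed slack for estimation error
	budget := contextWindow - maxCompletionTokens - safetyMargin
	if budget < 0 {
		return 0
	}
	return budget
}

func main() {
	// e.g. a 272000-token window reserving 4096 tokens for the reply
	fmt.Println(promptTokenBudget(272000, 4096)) // 267648
}
```

Clamping to zero keeps the caller from passing a negative budget into the trimmer when the completion reservation alone exceeds a small context window.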
Status: Rebased on latest main, tests are green locally. Proceeding with review follow-ups if any.
Check out the review comment from copilot and implement it, @heusalagroupbot |
Rebase: rebased cleanly onto origin/main; pushed: no; tests: passed. |
Try again, @heusalagroupbot. Check out the review comment from the Copilot review and implement it.
Co-authored-by: Copilot <[email protected]>
Looks fine
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
Summary
- Add `TrimMessagesToFit` to fit prompts within the model window while preserving system/dev messages, dropping the oldest non-pinned messages, and truncating pinned content as needed.
- Add `PromptTokenBudget` and use it to reserve space for the completion.

Context
Addresses the "Input tokens exceed the configured limit of 272000 tokens" error in #75 by ensuring requests respect model context limits with a clear, deterministic policy.

Scope

Test plan
- `go test ./...` passes locally.
- Unit tests in `internal/oai/trim_test.go` validate behavior.

Closes #75.
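The test plan cites `internal/oai/trim_test.go`. A check in the spirit of those tests might exercise the drop-oldest rule in isolation; everything below is illustrative (the helper name, `Message` type, and drop semantics are assumptions, not the PR's actual test code).

```go
package main

import "fmt"

// Message is a simplified stand-in for the chat message type (assumption).
type Message struct{ Role, Content string }

// dropOldestNonPinned removes the earliest message whose role is neither
// "system" nor "developer"; it reports false when only pinned messages
// remain. This mirrors step 2 of the PR's trimming policy (sketch).
func dropOldestNonPinned(msgs []Message) ([]Message, bool) {
	for i, m := range msgs {
		if m.Role != "system" && m.Role != "developer" {
			// full slice expression avoids clobbering the caller's backing array
			return append(msgs[:i:i], msgs[i+1:]...), true
		}
	}
	return msgs, false
}

func main() {
	msgs := []Message{
		{Role: "system", Content: "rules"},
		{Role: "user", Content: "oldest"},
		{Role: "assistant", Content: "reply"},
		{Role: "user", Content: "newest"},
	}
	// Dropping twice removes the two oldest non-pinned messages in order,
	// leaving the pinned system message and the newest user message.
	msgs, _ = dropOldestNonPinned(msgs)
	msgs, _ = dropOldestNonPinned(msgs)
	for _, m := range msgs {
		fmt.Println(m.Role, m.Content)
	}
}
```

Testing the drop rule separately from the token estimator keeps the policy assertions independent of tokenizer details.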