Pass --keep to llama-server #14120


Merged · 1 commit · Jun 11, 2025

Conversation

MightyAlex200 (Contributor)
This is a simple two-line change to llama-server that passes the value of the --keep command-line flag to the server slots as keep_n. This allows llama-server to implement StreamingLLM properly, improving model coherence over long contexts when context shift is enabled. All the handling for this was already present in the code, merely unused, so I think this should be considered a bug fix.

@ggerganov ggerganov merged commit 2baf077 into ggml-org:master Jun 11, 2025
45 of 47 checks passed