Name and Version
$ ./build/bin/llama-server --version
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (12th Gen Intel(R) Core(TM) i7-12800H)
version: 4501 (667d7284)
built with cc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) for x86_64-redhat-linux
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
./build/bin/llama-server -m models/models/mistral-7b-instruct-v0.2.Q4_K_S.gguf
curl -H "Content-Type: application/json" -d '{"model":"gpt-3.5-turbo", "prompt": "something", "logprobs": 1, "max_tokens": 1}' http://127.0.0.1:8080/v1/completions | jq
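An equivalent request against the server's native /completion endpoint, which reports token probabilities via the n_probs parameter, can serve as a cross-check (a sketch, assuming the same loaded model and the default host/port):
curl -H "Content-Type: application/json" -d '{"prompt": "something", "n_probs": 1, "n_predict": 1}' http://127.0.0.1:8080/completion | jq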
Problem description & steps to reproduce
The logprobs field of the response is left as null despite logprobs being requested in the API call.
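For reference, an OpenAI-compatible response would be expected to populate logprobs with token-level detail along these lines (a sketch based on the OpenAI completions API; the field values are illustrative, not actual server output):
"logprobs": {
  "tokens": [" Ge"],
  "token_logprobs": [-0.12],
  "top_logprobs": [{" Ge": -0.12}],
  "text_offset": [0]
}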
First Bad Commit
No response
Relevant log output
{
  "choices": [
    {
      "text": " Ge",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "created": 1737534666,
  "model": "gpt-3.5-turbo",
  "system_fingerprint": "b4501-667d7284",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 1,
    "prompt_tokens": 2,
    "total_tokens": 3
  },
  "id": "chatcmpl-6ppJao4m9PSUS61B7zrPdaNoWG5L0wj8",
  "timings": {
    "prompt_n": 2,
    "prompt_ms": 796.332,
    "prompt_per_token_ms": 398.166,
    "prompt_per_second": 2.511515297639678,
    "predicted_n": 1,
    "predicted_ms": 0.032,
    "predicted_per_token_ms": 0.032,
    "predicted_per_second": 31250.0
  }
}