Name and Version
build: 4291 (ce8784b) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
Docker image name: ggerganov/llama.cpp:server-cuda
Docker image hash: sha256:8fa3ccfdcd21874c8a8b257b6bf6abf10070d612e00394b477ec124bd56f2d12
Operating systems
No response
Which llama.cpp modules do you know to be affected?
llama-server
Problem description & steps to reproduce
Started the server without speculative decoding, then sent:
curl --request POST \
--url http://localhost:8080/completion \
--header "Content-Type: application/json" \
--data '{"prompt": "Why is the sky is blue?", "n_probs": 10}'
The output doesn't contain completion_probabilities, which it should.
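For reference, a quick way to confirm whether the field is present, assuming jq is installed and the server is listening on the default port 8080 (a verification sketch, not part of the original report):

curl -s --request POST \
  --url http://localhost:8080/completion \
  --header "Content-Type: application/json" \
  --data '{"prompt": "Why is the sky is blue?", "n_probs": 10}' \
  | jq 'has("completion_probabilities")'
# Prints true on builds where n_probs works and false on affected builds.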
First Bad Commit
HINT:
For docker image server-cuda-b4274, n_probs is working as expected.
For docker image server-cuda-b4277, n_probs is not working.
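To narrow the regression window between those two tags, something along these lines could test each image in turn. This is a minimal sketch: the model path /models/model.gguf is a placeholder, the intermediate tags are assumed to exist as published images, and the sleep is a crude stand-in for waiting until the model has loaded:

for tag in b4274 b4275 b4276 b4277; do
  docker run -d --rm --gpus all --name llama-bisect -p 8080:8080 \
    -v /models:/models \
    ggerganov/llama.cpp:server-cuda-$tag \
    -m /models/model.gguf --host 0.0.0.0 --port 8080
  sleep 10  # crude wait for model load; adjust for model size
  result=$(curl -s --request POST --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Why is the sky is blue?", "n_probs": 10}' \
    | jq 'has("completion_probabilities")')
  echo "$tag: completion_probabilities present: $result"
  docker stop llama-bisect
done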
Relevant log output
No response