
Misc. bug: n_probs is not working with llama.cpp server #10733

Closed
@henryclw

Description

Name and Version

build: 4291 (ce8784b) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

Docker image name: ggerganov/llama.cpp:server-cuda
Docker image hash: sha256:8fa3ccfdcd21874c8a8b257b6bf6abf10070d612e00394b477ec124bd56f2d12

Operating systems

No response

Which llama.cpp modules do you know to be affected?

llama-server

Problem description & steps to reproduce

Started the server without speculative decoding, then sent the following request:

curl --request POST \
     --url http://localhost:8080/completion \
     --header "Content-Type: application/json" \
     --data '{"prompt": "Why is the sky is blue?", "n_probs": 10}'

The response does not contain completion_probabilities, which it should when n_probs is set.
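
For a quick check, the same request can be piped through jq (assuming jq is installed; has() only tests for the top-level key). On builds where n_probs works this prints true:

curl --silent --request POST \
     --url http://localhost:8080/completion \
     --header "Content-Type: application/json" \
     --data '{"prompt": "Why is the sky is blue?", "n_probs": 10}' \
     | jq 'has("completion_probabilities")'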

First Bad Commit

HINT:

For docker image server-cuda-b4274, n_probs works as expected.
For docker image server-cuda-b4277, n_probs does not work.
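
Since llama.cpp tags each build (b4274, b4277, ...), the candidate commits between the two images can be listed locally. This is only a sketch and assumes the server code still lives under examples/server for these builds:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git log --oneline b4274..b4277 -- examples/server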

Relevant log output

No response
