Misc. bug: Log probabilities of tokens are not produced for the /v1/completions endpoint #11346

Closed
@jpodivin

Description

Name and Version

$ ./build/bin/llama-server --version
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (12th Gen Intel(R) Core(TM) i7-12800H)
version: 4501 (667d7284)
built with cc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) for x86_64-redhat-linux

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

./build/bin/llama-server -m models/models/mistral-7b-instruct-v0.2.Q4_K_S.gguf

curl -H "Content-Type: application/json"  -d '{"model":"gpt-3.5-turbo", "prompt": "something", "logprobs": 1, "max_tokens": 1}' http://127.0.0.1:8080/v1/completions | jq

Problem description & steps to reproduce

The logprobs field of the response is null, even though log probabilities were requested in the API call (logprobs: 1).
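
For reference, a minimal Python sketch of the same request (assuming the server is running locally on 127.0.0.1:8080 as started above; the model name and prompt are arbitrary). It prints the logprobs field of the first choice, which currently comes back as None instead of a populated object:

import json
import urllib.request

# Same request as the curl command above: one completion token,
# with per-token log probabilities requested via "logprobs": 1.
payload = {
    "model": "gpt-3.5-turbo",
    "prompt": "something",
    "logprobs": 1,
    "max_tokens": 1,
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# Expected: an object with per-token log probabilities.
# Observed: None (null in the JSON response).
print(body["choices"][0]["logprobs"])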

First Bad Commit

No response

Relevant log output

{
  "choices": [
    {
      "text": " Ge",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "created": 1737534666,
  "model": "gpt-3.5-turbo",
  "system_fingerprint": "b4501-667d7284",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 1,
    "prompt_tokens": 2,
    "total_tokens": 3
  },
  "id": "chatcmpl-6ppJao4m9PSUS61B7zrPdaNoWG5L0wj8",
  "timings": {
    "prompt_n": 2,
    "prompt_ms": 796.332,
    "prompt_per_token_ms": 398.166,
    "prompt_per_second": 2.511515297639678,
    "predicted_n": 1,
    "predicted_ms": 0.032,
    "predicted_per_token_ms": 0.032,
    "predicted_per_second": 31250.0
  }
}
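
For comparison, under the OpenAI-style text-completion schema the logprobs field would be expected to contain an object rather than null. A rough illustration of that shape (the values below are invented, not taken from the server):

# Illustrative shape only; the numbers are made up.
expected_logprobs = {
    "tokens": [" Ge"],
    "token_logprobs": [-0.12],
    "top_logprobs": [{" Ge": -0.12}],
    "text_offset": [0],
}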
