Misc. bug: crashes when calling llama_state_get_size on a reranking model #13463

Closed
giladgd opened this issue May 12, 2025 · 2 comments · Fixed by #13470 or #13542

giladgd (Contributor) commented May 12, 2025

Name and Version

Compiled at commit 6562e5a, but it also reproduces on the latest master.

Operating systems

Mac

Which llama.cpp modules do you know to be affected?

libllama (core library)

Problem description & steps to reproduce

The process crashes with SIGSEGV when calling llama_state_get_size on a context created with a reranking model (bge-reranker-v2-m3-Q8_0.gguf in my tests).

Here's a minimal reproduction:

#include <cstdio>
#include <string>

#include "llama.h"

void repro() {
    llama_backend_init();

    auto model_params = llama_model_default_params();
    model_params.n_gpu_layers = 33;

    auto model_path = "/home/user/models/bge-reranker-v2-m3-Q8_0.gguf";
    auto model = llama_model_load_from_file(model_path, model_params);
    fputs("model loaded\n", stdout);
    fflush(stdout);

    // Create a reranking context: embeddings enabled with RANK pooling
    auto context_params = llama_context_default_params();
    context_params.embeddings = true;
    context_params.pooling_type = LLAMA_POOLING_TYPE_RANK;
    auto ctx = llama_init_from_model(model, context_params);
    fputs("context created\n", stdout);
    fflush(stdout);

    auto state_size = llama_state_get_size(ctx); // <- crashes here with SIGSEGV
    fputs(("State size: " + std::to_string(state_size) + "\n").c_str(), stdout);
    fflush(stdout);

    llama_free(ctx);
    llama_model_free(model);

    llama_backend_free();
}
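
For reference, llama_state_get_size is normally the first half of a state-serialization round trip; a minimal sketch of the intended flow (assuming the standard llama_state_* API) is:

std::vector<uint8_t> state(llama_state_get_size(ctx)); // <- with a reranking context, this call never returns
const size_t written = llama_state_get_data(ctx, state.data(), state.size());
// The first `written` bytes of `state` can later be restored with llama_state_set_data()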

First Bad Commit

The issue was introduced in commit 6562e5a (PR #13108).

Relevant log output

Last log lines before the crash:

set_abort_callback: call
llama_context:        CPU  output buffer size =     0.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 3
llama_context: max_nodes = 65536
context created
state_write_data: writing state
state_write_data: - writing model info
state_write_data: - writing output ids
state_write_data: - writing logits
state_write_data: - writing embeddings
state_write_data: - writing KV self

Backtrace from lldb:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000104d2dd08 libllama.dylib`llama_context::state_write_data(llama_io_write_i&) + 684
    frame #1: 0x0000000104d2d9e4 libllama.dylib`llama_context::state_get_size() + 40
    frame #2: 0x000000010456f708 llama-addon.node`repro() + 240
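
The fault address 0x0 indicates a null-pointer dereference inside llama_context::state_write_data.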
ggerganov (Member) commented

See if #13470 fixes the problem.

giladgd (Contributor, Author) commented May 12, 2025

The repro code still crashes on the latest master (f0d46ef).
