Misc. bug: crashes when calling llama_state_get_size on a reranking model #13463

Closed
giladgd opened this issue May 12, 2025 · 2 comments · Fixed by #13470 or #13542

giladgd (Contributor) commented May 12, 2025

Name and Version

Compiled at commit 6562e5a, but it also reproduces on the latest master.

Operating systems

Mac

Which llama.cpp modules do you know to be affected?

libllama (core library)

Problem description & steps to reproduce

The process crashes with SIGSEGV when calling llama_state_get_size on a context created with a reranking model (bge-reranker-v2-m3-Q8_0.gguf in my tests).

Here's a minimal reproduction:

#include <cstdio>
#include <string>

#include "llama.h"

void repro() {
    llama_backend_init();

    auto model_params = llama_model_default_params();
    model_params.n_gpu_layers = 33;

    auto model_path = "/home/user/models/bge-reranker-v2-m3-Q8_0.gguf";
    auto model = llama_model_load_from_file(model_path, model_params);
    fputs("model loaded\n", stdout);
    fflush(stdout);

    // Create a reranking context: embeddings enabled with RANK pooling
    auto context_params = llama_context_default_params();
    context_params.embeddings = true;
    context_params.pooling_type = LLAMA_POOLING_TYPE_RANK;
    auto ctx = llama_init_from_model(model, context_params);
    fputs("context created\n", stdout);
    fflush(stdout);

    auto state_size = llama_state_get_size(ctx); // <- crashes here with SIGSEGV
    fputs(("State size: " + std::to_string(state_size) + "\n").c_str(), stdout);
    fflush(stdout);

    llama_free(ctx);
    llama_model_free(model);

    llama_backend_free();
}
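
For reference, llama_state_get_size is normally the first half of a state-serialization round trip; a minimal sketch of the intended flow (assuming the standard llama_state_* API) is:

std::vector<uint8_t> state(llama_state_get_size(ctx)); // <- with a reranking context, this call never returns
const size_t written = llama_state_get_data(ctx, state.data(), state.size());
// The first `written` bytes of `state` can later be restored with llama_state_set_data()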

First Bad Commit

The issue was introduced in commit 6562e5a (PR #13108).

Relevant log output

Last log lines before the crash:

set_abort_callback: call
llama_context:        CPU  output buffer size =     0.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 3
llama_context: max_nodes = 65536
context created
state_write_data: writing state
state_write_data: - writing model info
state_write_data: - writing output ids
state_write_data: - writing logits
state_write_data: - writing embeddings
state_write_data: - writing KV self

Backtrace from lldb:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000104d2dd08 libllama.dylib`llama_context::state_write_data(llama_io_write_i&) + 684
    frame #1: 0x0000000104d2d9e4 libllama.dylib`llama_context::state_get_size() + 40
    frame #2: 0x000000010456f708 llama-addon.node`repro() + 240
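
The fault address 0x0 indicates a null-pointer dereference inside llama_context::state_write_data.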
ggerganov (Member) commented

See if #13470 fixes the problem.

giladgd (Contributor, Author) commented May 12, 2025

The repro code still crashes on the latest master (f0d46ef).
