
Bug: Some "code" models invoke undefined behavior at load time after #6745 #7592

@cebtenzzre

Description

What happened?

Steps to reproduce:

$ build/bin/main -m mistral-7b-code-16k-qlora.Q4_K_M.gguf -ngl 99 -n 0 -p ''
llm_load_print_meta: BOS token        = 1 '<s>'
llm_load_print_meta: EOS token        = 2 '</s>'
llm_load_print_meta: UNK token        = 0 '<unk>'
llm_load_print_meta: PAD token        = 0 '<unk>'
llm_load_print_meta: LF token         = 13 '<0x0A>'
/usr/include/c++/14.1.1/bits/stl_vector.h:1149: std::vector<_Tp, _Alloc>::const_reference std::vector<_Tp, _Alloc>::operator[](size_type) const [with _Tp = llama_vocab::token_data; _Alloc = std::allocator<llama_vocab::token_data>; const_reference = const llama_vocab::token_data&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
[1]    40137 IOT instruction (core dumped)  build/bin/main -m mistral-7b-code-16k-qlora.Q4_K_M.gguf

The crash happens when printing the PRE token: its ID is 32007 (the default for "code" models), but n_vocab is only 32000, so the lookup indexes past the end of the vocab vector and trips the libstdc++ assertion. Before #6745, the model loaded successfully.
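A minimal sketch of the kind of guard that would avoid this (names like `token_data` and `token_to_piece_safe` are stand-ins, not llama.cpp's actual API): validate the special-token ID against the vocab size before indexing, rather than doing an unchecked `operator[]`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for llama_vocab::token_data.
struct token_data {
    std::string text;
};

// Return the token text only if the id is in range; otherwise a
// placeholder, instead of indexing the vector out of bounds.
static std::string token_to_piece_safe(const std::vector<token_data> & vocab, int32_t id) {
    if (id < 0 || (size_t) id >= vocab.size()) {
        return "<out-of-range:" + std::to_string(id) + ">";
    }
    return vocab[(size_t) id].text;
}
```

With a 32000-entry vocab, looking up ID 32007 (the PRE token from the crash above) would then produce a placeholder string instead of aborting at load time.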

Name and Version

$ build/bin/main --version
version: 2702 (b97bc3966)
built with gcc (GCC) 14.1.1 20240522 for x86_64-pc-linux-gnu

What operating system are you seeing the problem on?

No response

Relevant log output

No response


Labels

bug-unconfirmed, high severity (used to report high severity bugs in llama.cpp: malfunctioning hinders important workflow)
