Closed
Labels
bug-unconfirmed, high severity
Description
What happened?
Steps to reproduce:
- Download https://huggingface.co/TheBloke/Mistral-7B-Code-16K-qlora-GGUF/blob/main/mistral-7b-code-16k-qlora.Q4_K_M.gguf (this is related to general.name containing the string "code")
- Build with -DCMAKE_BUILD_TYPE=Debug on Linux to enable libstdc++ assertions (this makes the UB more obvious)
- Try to load the model (it crashes):
$ build/bin/main -m mistral-7b-code-16k-qlora.Q4_K_M.gguf -ngl 99 -n 0 -p ''
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 2 '</s>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: PAD token = 0 '<unk>'
llm_load_print_meta: LF token = 13 '<0x0A>'
/usr/include/c++/14.1.1/bits/stl_vector.h:1149: std::vector<_Tp, _Alloc>::const_reference std::vector<_Tp, _Alloc>::operator[](size_type) const [with _Tp = llama_vocab::token_data; _Alloc = std::allocator<llama_vocab::token_data>; const_reference = const llama_vocab::token_data&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
[1] 40137 IOT instruction (core dumped) build/bin/main -m mistral-7b-code-16k-qlora.Q4_K_M.gguf
The crash happens when printing the PRE token: its token ID is 32007 (the default for "code" models), but n_vocab is only 32000, so the lookup indexes past the end of the vocabulary. Before #6745, the model loaded successfully.
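To illustrate the kind of guard that would avoid the out-of-range access, here is a minimal sketch of a bounds-checked token lookup. The names `token_data` and `describe_token` are hypothetical stand-ins for illustration only; this is not the actual llama.cpp code or the fix that was applied upstream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a vocabulary entry (not the real llama_vocab::token_data).
struct token_data {
    std::string text;
};

// Returns the token text when `id` is a valid index into the vocabulary,
// or a placeholder string when the id is negative or past the end.
// This avoids calling operator[] with an index >= size(), which is UB.
std::string describe_token(const std::vector<token_data> & id_to_token, int32_t id) {
    if (id < 0 || static_cast<std::size_t>(id) >= id_to_token.size()) {
        return "<out of range>";
    }
    return id_to_token[static_cast<std::size_t>(id)].text;
}
```

With a 32000-entry vocabulary, `describe_token(vocab, 32007)` returns the placeholder instead of tripping the libstdc++ assertion. Whether an out-of-range special token should be skipped, reset to "unset", or reported as a load error is a design decision this sketch does not settle.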
Name and Version
$ build/bin/main --version
version: 2702 (b97bc3966)
built with gcc (GCC) 14.1.1 20240522 for x86_64-pc-linux-gnu
What operating system are you seeing the problem on?
No response
Relevant log output
No response
kosimas