Name and Version
llama-server -hf nomic-ai/nomic-embed-text-v2-moe-GGUF:Q4_K_M --embeddings
This version works correctly:
llama-server --version
version: 5569 (e57bb87c)
built with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2.0.1) for x86_64-redhat-linux
All subsequent versions, including the latest, have this issue.
Operating systems
Linux
GGML backends
CPU
Hardware
INTEL(R) XEON(R) GOLD 6530
Models
No response
Problem description & steps to reproduce
llama-server -hf nomic-ai/nomic-embed-text-v2-moe-GGUF:Q4_K_M --embeddings
The server fails to load the model.
First Bad Commit
No response
Relevant log output
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 321.66 MiB (5.68 BPW)
load: model vocab missing newline token, using special_pad_id instead
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 4
load: token to piece cache size = 2.1668 MB
llama_model_load: error loading model: error loading model vocabulary: _Map_base::at
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/root/.cache/llama.cpp/nomic-ai_nomic-embed-text-v2-moe-GGUF_nomic-embed-text-v2-moe.Q4_K_M.gguf'
srv load_model: failed to load model, '/root/.cache/llama.cpp/nomic-ai_nomic-embed-text-v2-moe-GGUF_nomic-embed-text-v2-moe.Q4_K_M.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error