Description
What happened?
I am trying to convert the bert-base-uncased HuggingFace model to GGUF with convert-hf-to-gguf.py. Unfortunately, the conversion fails because the script looks for `embeddings.position_embeddings`, etc. in tensor_mapping.py, but not `bert.embeddings.position_embeddings`, etc. This matters because most of the tensor names in the model start with `bert.`.
There is a similar issue in modify_tensors: it does not skip the `cls` tensors present in bert-base-uncased, so it fails in the same way.
Finally, the bert-base-uncased config.json has its architecture set to `BertForMaskedLM`. Unless I change this to `BertModel`, the script returns `ERROR:hf-to-gguf:Model BertForMaskedLM is not supported`.
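
For reference, the config.json workaround is a one-field change (sketch; only the `architectures` field is shown, the rest of the file is unchanged):

```json
{
  "architectures": ["BertModel"]
}
```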
I have modified write_tensors to drop `bert.` from each name, and modified modify_tensors to ignore names containing `cls.`. Re-running the script then outputs "Model successfully exported", but when I try to generate an embedding with the model via llama-cli, I get this subsequent error: `llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd_norm.weight' not found`.
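
To make the two modifications concrete, here is a minimal sketch (the helper names are hypothetical, not the actual llama.cpp code; the real changes live in convert-hf-to-gguf.py's BertModel class):

```python
# Hypothetical helpers illustrating the workaround described above.

def strip_bert_prefix(name: str) -> str:
    """Drop the leading 'bert.' so tensor names match tensor_mapping.py entries."""
    prefix = "bert."
    return name[len(prefix):] if name.startswith(prefix) else name

def is_cls_tensor(name: str) -> bool:
    """True for the MLM-head ('cls.*') tensors, which the converter should skip."""
    return name.startswith("cls.")

# Example: the tensor named in the traceback maps to a prefix-free name.
print(strip_bert_prefix("bert.embeddings.LayerNorm.beta"))  # embeddings.LayerNorm.beta
print(is_cls_tensor("cls.predictions.bias"))                # True
```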
Name and Version
version: 3051 (5921b8f)
built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Steps to Reproduce
Clone bert-base-uncased and run the convert script on it.
To see my attempted solution, do these steps:
- Change `BertForMaskedLM` to `BertModel` in bert-base-uncased/config.json.
- Run the convert script in my fork of llama.cpp on the directory.
- Run `./llama-cli -m` on the converted model.
Relevant log output
$ python convert-hf-to-gguf.py /home/nsage/bert-base-uncased/ --outtype f16 --outfile /home/nsage/bert-base-uncased/bert_converted.gguf
INFO:hf-to-gguf:Loading model: bert-base-uncased
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 512
INFO:hf-to-gguf:gguf: embedding length = 768
INFO:hf-to-gguf:gguf: feed forward length = 3072
INFO:hf-to-gguf:gguf: head count = 12
INFO:hf-to-gguf:gguf: layer norm epsilon = 1e-12
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Setting special token type pad to 0
INFO:hf-to-gguf:Exporting model to '/home/nsage/bert-base-uncased/bert_converted.gguf'
INFO:hf-to-gguf:gguf: loading model part 'model.safetensors'
Traceback (most recent call last):
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 2862, in <module>
main()
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 2856, in main
model_instance.write()
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 328, in write
self.write_tensors()
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 265, in write_tensors
for new_name, data in ((n, d.squeeze().numpy()) for n, d in self.modify_tensors(data_torch, name, bid)):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 2190, in modify_tensors
return [(self.map_tensor_name(name), data_torch)]
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 180, in map_tensor_name
raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'bert.embeddings.LayerNorm.beta'
$ ./llama-cli -m /home/nsage/bert-base-uncased/ggml-model-Q4_0.gguf -p "The sky is blue."
...
llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd_norm.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/home/nsage/bert-base-uncased/ggml-model-Q4_0.gguf'
main: error: unable to load model