Description
What happened?
I am trying to convert the bert-base-uncased HuggingFace model to GGUF with convert-hf-to-gguf.py. Unfortunately, the conversion fails because the script looks for `embeddings.position_embeddings`, etc. in tensor_mapping.py, but not `bert.embeddings.position_embeddings`, etc. This matters because most of the tensor names in the model start with `bert.`.
There is a similar issue in modify_tensors: it does not skip the `cls` tensors present in bert-base-uncased, so it fails in the same way.
Finally, the bert-base-uncased config.json has its architecture set to `BertForMaskedLM`. Unless I change this to `BertModel`, the script returns `ERROR:hf-to-gguf:Model BertForMaskedLM is not supported`.
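
For reference, the config.json workaround is a one-field change (sketch; only the `architectures` field is shown, the rest of the file is unchanged):

```json
{
  "architectures": ["BertModel"]
}
```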
I have modified write_tensors to drop `bert.` from each name, and modified modify_tensors to ignore names containing `cls.`. Re-running the script then outputs "Model successfully exported", but when I try to generate an embedding with the model via llama-cli, I get this subsequent error: `llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd_norm.weight' not found`.
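
To make the two modifications concrete, here is a minimal sketch (the helper names are hypothetical, not the actual llama.cpp code; the real changes live in convert-hf-to-gguf.py's BertModel class):

```python
# Hypothetical helpers illustrating the workaround described above.

def strip_bert_prefix(name: str) -> str:
    """Drop the leading 'bert.' so tensor names match tensor_mapping.py entries."""
    prefix = "bert."
    return name[len(prefix):] if name.startswith(prefix) else name

def is_cls_tensor(name: str) -> bool:
    """True for the MLM-head ('cls.*') tensors, which the converter should skip."""
    return name.startswith("cls.")

# Example: the tensor named in the traceback maps to a prefix-free name.
print(strip_bert_prefix("bert.embeddings.LayerNorm.beta"))  # embeddings.LayerNorm.beta
print(is_cls_tensor("cls.predictions.bias"))                # True
```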
Name and Version
version: 3051 (5921b8f)
built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Steps to Reproduce
Clone bert-base-uncased and run the convert script on it.
To see my attempted solution, do these steps:
- Change `BertForMaskedLM` to `BertModel` in bert-base-uncased/config.json.
- Run the convert script in my fork of llama.cpp on the directory.
- Run `./llama-cli -m` on the converted model.
Relevant log output
$ python convert-hf-to-gguf.py /home/nsage/bert-base-uncased/ --outtype f16 --outfile /home/nsage/bert-base-uncased/bert_converted.gguf
INFO:hf-to-gguf:Loading model: bert-base-uncased
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 512
INFO:hf-to-gguf:gguf: embedding length = 768
INFO:hf-to-gguf:gguf: feed forward length = 3072
INFO:hf-to-gguf:gguf: head count = 12
INFO:hf-to-gguf:gguf: layer norm epsilon = 1e-12
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Setting special token type pad to 0
INFO:hf-to-gguf:Exporting model to '/home/nsage/bert-base-uncased/bert_converted.gguf'
INFO:hf-to-gguf:gguf: loading model part 'model.safetensors'
Traceback (most recent call last):
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 2862, in <module>
main()
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 2856, in main
model_instance.write()
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 328, in write
self.write_tensors()
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 265, in write_tensors
for new_name, data in ((n, d.squeeze().numpy()) for n, d in self.modify_tensors(data_torch, name, bid)):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 2190, in modify_tensors
return [(self.map_tensor_name(name), data_torch)]
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nsage/ollama/llm/llama.cpp/convert-hf-to-gguf.py", line 180, in map_tensor_name
raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'bert.embeddings.LayerNorm.beta'
$ ./llama-cli -m /home/nsage/bert-base-uncased/ggml-model-Q4_0.gguf -p "The sky is blue."
...
llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd_norm.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/home/nsage/bert-base-uncased/ggml-model-Q4_0.gguf'
main: error: unable to load model