A simple function that reads metadata from GGUF files. It has no Lisp or external dependencies, so there's no need for a llama.cpp build or other tooling.
GGUF (GPT-Generated Unified Format) is a binary file format designed for efficient storage and deployment of large language models. See the GGUF file format description for details.
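For orientation, every GGUF file begins with a small fixed header: the 4 magic bytes "GGUF", a little-endian uint32 format version, a uint64 tensor count, and a uint64 metadata key/value count. The sketch below (not this library's actual internals; `read-le-uint` and `sketch-read-gguf-header` are illustrative names) shows how that header can be read in plain Common Lisp:

```lisp
(defun read-le-uint (stream n-bytes)
  "Read an N-BYTES little-endian unsigned integer from a binary STREAM."
  (loop with value = 0
        for i from 0 below n-bytes
        do (setf value (logior value (ash (read-byte stream) (* 8 i))))
        finally (return value)))

(defun sketch-read-gguf-header (path)
  "Illustrative sketch: read the fixed GGUF header fields from PATH."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((magic (make-array 4 :element-type '(unsigned-byte 8))))
      (read-sequence magic in)
      ;; The magic bytes are ASCII "GGUF" (71 71 85 70).
      (unless (equalp magic #(71 71 85 70))
        (error "~a is not a GGUF file" path))
      (list :version (read-le-uint in 4)
            :tensor-count (read-le-uint in 8)
            :metadata-kv-count (read-le-uint in 8)))))
```

The metadata key/value pairs follow immediately after this header, which is why the whole table can be read without touching the tensor data.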
If the code reads as suboptimal, it probably is; some of it was generated by an LLM as a quick and easy means to an end.
This has only been tested by hand on some HuggingFace model files I had hanging around. I haven't tested it on both little-endian and big-endian files; I suspect it works only for little-endian model files, though the spec may now support big-endian formats as well.
This module is available via Ultralisp.
(ql:quickload :gguf-metadata)
(in-package :gguf-metadata)
;; Retrieve a hash table with all keys and values from the file's metadata.
(read-gguf-metadata #P"/models/mistral-7b-instruct.Q4_K_M.gguf") => hash-table
;; Print some of the data to *standard-output*
(report-gguf-metadata #P"/models/mistral-7b-instruct.Q4_K_M.gguf")
NOTE: some values for keys starting with "tokenizer." can be quite large; you may not want to print them in your REPL buffer.
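To avoid printing the large tokenizer entries, you can look up individual keys in the returned hash table instead of dumping the whole thing; the key chosen here is just an example:

```lisp
;; Fetch the whole metadata table once, then inspect single keys.
(let ((ht (gguf-metadata:read-gguf-metadata
           #P"/models/mistral-7b-instruct.Q4_K_M.gguf")))
  (gethash "general.architecture" ht))
```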
Some example output (this assumes Alexandria is among your available packages):
(let ((ht (gguf-metadata:report-gguf-metadata "~/models/Mistral-7B-Instruct-v0.3.Q5_K_M.gguf")))
(format t "~%Keys: ~s~%" (sort (alexandria:hash-table-keys ht) #'string<)))
GGUF version 3, 291 tensors, 29 metadata entries
Model name: models--mistralai--Mistral-7B-Instruct-v0.3
Model version: unknown
Architecture: llama
Quantization: MOSTLY_Q5_K_M
Size label: unknown
Quantization version: 2
llama.context_length: 32768
Keys: ("general.architecture" "general.file_type" "general.name"
"general.quantization_version" "llama.attention.head_count"
"llama.attention.head_count_kv" "llama.attention.layer_norm_rms_epsilon"
"llama.block_count" "llama.context_length" "llama.embedding_length"
"llama.feed_forward_length" "llama.rope.dimension_count"
"llama.rope.freq_base" "llama.vocab_size"
"quantize.imatrix.chunks_count" "quantize.imatrix.dataset"
"quantize.imatrix.entries_count" "quantize.imatrix.file"
"tokenizer.chat_template" "tokenizer.ggml.add_bos_token"
"tokenizer.ggml.add_eos_token" "tokenizer.ggml.bos_token_id"
"tokenizer.ggml.eos_token_id" "tokenizer.ggml.model"
"tokenizer.ggml.pre" "tokenizer.ggml.scores" "tokenizer.ggml.token_type"
"tokenizer.ggml.tokens" "tokenizer.ggml.unknown_token_id")
All testing was on Fedora x86_64, which hopefully doesn't matter for this package.
It worked on these Lisps:
- SBCL 2.5.3
- CCL 1.13
- ECL 24.5.10
- ABCL 1.9.2 on JVM 21.0.8