A simple function that reads metadata from GGUF files. It has no Lisp or external dependencies, so there's no need for a llama.cpp build or other tooling.
GGUF (GPT-Generated Unified Format) is a binary file format designed for efficient storage and deployment of large language models. See the GGUF file format description for details.
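For orientation, every GGUF file begins with a small fixed header: the 4 magic bytes "GGUF", a little-endian uint32 format version, a uint64 tensor count, and a uint64 metadata key/value count. The sketch below (not this library's actual internals; `read-le-uint` and `sketch-read-gguf-header` are illustrative names) shows how that header can be read in plain Common Lisp:

```lisp
(defun read-le-uint (stream n-bytes)
  "Read an N-BYTES little-endian unsigned integer from a binary STREAM."
  (loop with value = 0
        for i from 0 below n-bytes
        do (setf value (logior value (ash (read-byte stream) (* 8 i))))
        finally (return value)))

(defun sketch-read-gguf-header (path)
  "Illustrative sketch: read the fixed GGUF header fields from PATH."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((magic (make-array 4 :element-type '(unsigned-byte 8))))
      (read-sequence magic in)
      ;; The magic bytes are ASCII "GGUF" (71 71 85 70).
      (unless (equalp magic #(71 71 85 70))
        (error "~a is not a GGUF file" path))
      (list :version (read-le-uint in 4)
            :tensor-count (read-le-uint in 8)
            :metadata-kv-count (read-le-uint in 8)))))
```

The metadata key/value pairs follow immediately after this header, which is why the whole table can be read without touching the tensor data.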
If the code reads as suboptimal, it probably is; some of it was generated by an LLM as a quick and easy means to an end.
This has only been tested by hand on some HuggingFace model files I had hanging around. I haven't tested it on both little-endian and big-endian files; I suspect it works only for little-endian model files, though the spec may now support big-endian formats as well.
This module is available via Ultralisp.
(ql:quickload :gguf-metadata)
(in-package :gguf-metadata)
;; Retrieve a hash table with all keys and values from the file's metadata.
(read-gguf-metadata #P"/models/mistral-7b-instruct.Q4_K_M.gguf") => hash-table
;; Print some of the data to *standard-output*
(report-gguf-metadata #P"/models/mistral-7b-instruct.Q4_K_M.gguf")
NOTE: some values for keys starting with "tokenizer." can be quite large; you may not want to print them in your REPL buffer.
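To avoid printing the large tokenizer entries, you can look up individual keys in the returned hash table instead of dumping the whole thing; the key chosen here is just an example:

```lisp
;; Fetch the whole metadata table once, then inspect single keys.
(let ((ht (gguf-metadata:read-gguf-metadata
           #P"/models/mistral-7b-instruct.Q4_K_M.gguf")))
  (gethash "general.architecture" ht))
```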
Some example output (this assumes Alexandria is among your available packages):
(let ((ht (gguf-metadata:report-gguf-metadata "~/models/Mistral-7B-Instruct-v0.3.Q5_K_M.gguf")))
(format t "~%Keys: ~s~%" (sort (alexandria:hash-table-keys ht) #'string<)))
GGUF version 3, 291 tensors, 29 metadata entries
Model name: models--mistralai--Mistral-7B-Instruct-v0.3
Model version: unknown
Architecture: llama
Quantization: MOSTLY_Q5_K_M
Size label: unknown
Quantization version: 2
llama.context_length: 32768
Keys: ("general.architecture" "general.file_type" "general.name"
"general.quantization_version" "llama.attention.head_count"
"llama.attention.head_count_kv" "llama.attention.layer_norm_rms_epsilon"
"llama.block_count" "llama.context_length" "llama.embedding_length"
"llama.feed_forward_length" "llama.rope.dimension_count"
"llama.rope.freq_base" "llama.vocab_size"
"quantize.imatrix.chunks_count" "quantize.imatrix.dataset"
"quantize.imatrix.entries_count" "quantize.imatrix.file"
"tokenizer.chat_template" "tokenizer.ggml.add_bos_token"
"tokenizer.ggml.add_eos_token" "tokenizer.ggml.bos_token_id"
"tokenizer.ggml.eos_token_id" "tokenizer.ggml.model"
"tokenizer.ggml.pre" "tokenizer.ggml.scores" "tokenizer.ggml.token_type"
"tokenizer.ggml.tokens" "tokenizer.ggml.unknown_token_id")
All testing was on Fedora x86_64, which hopefully doesn't matter for this package.
It worked on these Lisps:
- SBCL 2.5.3
- CCL 1.13
- ECL 24.5.10
- ABCL 1.9.2 on JVM 21.0.8