quantize: Use UINT32 if there's an INT KV override #14197
Merged
When quantising a model and overriding integer metadata parameters at the same time, the overridden keys are written with type `int` even though their original type is `unsigned int`. In certain cases this behaviour triggers an exception when loading the quantised model.

For example, using the model available here:

```
llama-quantize --override-kv qwen3moe.expert_used_count=int:16 Qwen3-30B-A3B-BF16.gguf Qwen3-30B-A3B-Q4_K_M.gguf Q4_K_M
llama-simple -m Qwen3-30B-A3B-Q4_K_M.gguf "Hello, world!"
```

will lead to the following exception when loading the quantised model:

```
error loading model hyperparameters: key qwen3moe.expert_used_count has wrong type i32 but expected type u32
```
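The failure itself comes from the loader's strict type check on GGUF metadata: a key stored as `INT32` does not satisfy a `UINT32` read. The snippet below is not the actual llama.cpp loader code, just a minimal standalone sketch using the public gguf C API that reproduces the same kind of check:

```cpp
// Sketch only: illustrates why a key written as i32 is rejected when the
// loader expects u32. Uses the public gguf C API from ggml (gguf.h).
#include <cstdio>
#include "gguf.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }

    struct gguf_init_params params = { /*no_alloc =*/ true, /*ctx =*/ nullptr };
    struct gguf_context * ctx = gguf_init_from_file(argv[1], params);
    if (!ctx) {
        fprintf(stderr, "failed to open %s\n", argv[1]);
        return 1;
    }

    const char * key = "qwen3moe.expert_used_count";
    const int64_t id = gguf_find_key(ctx, key);
    if (id >= 0 && gguf_get_kv_type(ctx, id) != GGUF_TYPE_UINT32) {
        // This is the condition an --override-kv ...=int:16 override trips:
        // the value was stored as INT32, but the loader expects UINT32.
        fprintf(stderr, "key %s has wrong type (expected u32)\n", key);
    }

    gguf_free(ctx);
    return 0;
}
```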
This PR changes the `if (params->kv_overrides)` logic in llama-quant.cpp to use `uint32` if there are any `int` overrides, so that

```
llama-quantize --override-kv qwen3moe.expert_used_count=int:16 Qwen3-30B-A3B-BF16.gguf Qwen3-30B-A3B-Q4_K_M.gguf Q4_K_M
```

generates a functioning model.

More context here.
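A minimal sketch of the idea, assuming the KV-override dispatch in llama-quant.cpp follows the pattern below (the helper name `apply_kv_overrides` is made up for illustration; in llama.cpp this logic lives inline in the quantisation routine):

```cpp
// Sketch of the KV-override write path affected by this PR. The key point is
// that integer overrides are written as UINT32 rather than INT32, matching
// how integer hyperparameters such as *.expert_used_count are declared in GGUF.
#include <cstdint>
#include <vector>

#include "llama.h"   // llama_model_kv_override, LLAMA_KV_OVERRIDE_TYPE_*
#include "gguf.h"    // gguf_set_val_*

static void apply_kv_overrides(struct gguf_context * ctx_out,
                               const std::vector<llama_model_kv_override> & overrides) {
    for (const auto & o : overrides) {
        if (o.key[0] == 0) break;  // end-of-list sentinel
        switch (o.tag) {
            case LLAMA_KV_OVERRIDE_TYPE_FLOAT:
                gguf_set_val_f32(ctx_out, o.key, (float) o.val_f64);
                break;
            case LLAMA_KV_OVERRIDE_TYPE_INT:
                // Previously written with gguf_set_val_i32, which produced the
                // i32/u32 mismatch at load time; write the value as u32 instead.
                gguf_set_val_u32(ctx_out, o.key, (uint32_t) o.val_i64);
                break;
            case LLAMA_KV_OVERRIDE_TYPE_BOOL:
                gguf_set_val_bool(ctx_out, o.key, o.val_bool);
                break;
            case LLAMA_KV_OVERRIDE_TYPE_STR:
                gguf_set_val_str(ctx_out, o.key, o.val_str);
                break;
        }
    }
}
```

Note the cast from the override's `int64_t` value to `uint32_t`: overrides parsed as `int:` are stored as signed 64-bit values, so writing them as u32 assumes they fit in an unsigned 32-bit range, which holds for hyperparameters like `expert_used_count`.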