
quantize: Use UINT32 if there's an INT KV override #14197


Merged
1 commit merged into ggml-org:master from EAddario:quant_i32 on Jun 15, 2025

Conversation

EAddario
Contributor

When quantising a model and overriding integer metadata parameters at the same time, the overridden keys are written as signed int even when their original type is unsigned int. In certain cases this behaviour triggers an exception when loading the quantised model.

For example, using the model available here:

  1. Quantise & override: llama-quantize --override-kv qwen3moe.expert_used_count=int:16 Qwen3-30B-A3B-BF16.gguf Qwen3-30B-A3B-Q4_K_M.gguf Q4_K_M
  2. Load model: llama-simple -m Qwen3-30B-A3B-Q4_K_M.gguf "Hello, world!"

Step 2 will fail with an error loading model hyperparameters: key qwen3moe.expert_used_count has wrong type i32 but expected type u32 exception.
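For context, the loader is strict about the stored GGUF value type, so a key written as i32 is rejected wherever u32 is expected, even when the value itself fits. A minimal sketch of that kind of check against the public gguf API (file name taken from the example above, error handling trimmed, include path may differ between ggml versions):

```cpp
#include <cstdint>
#include <cstdio>
#include "gguf.h"

int main() {
    // Read only the metadata, no tensor data allocation.
    struct gguf_init_params params = { /*no_alloc =*/ true, /*ctx =*/ nullptr };
    struct gguf_context * ctx = gguf_init_from_file("Qwen3-30B-A3B-Q4_K_M.gguf", params);
    if (!ctx) {
        return 1;
    }

    const char * key = "qwen3moe.expert_used_count";
    const int64_t kid = gguf_find_key(ctx, key);
    if (kid < 0) {
        fprintf(stderr, "key %s not found\n", key);
    } else if (gguf_get_kv_type(ctx, kid) != GGUF_TYPE_UINT32) {
        // An int:16 override stores the value as INT32, so this branch fires
        // even though 16 is perfectly representable as a uint32.
        fprintf(stderr, "key %s has wrong type, expected u32\n", key);
    } else {
        printf("%s = %u\n", key, gguf_get_val_u32(ctx, kid));
    }

    gguf_free(ctx);
    return 0;
}
```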

This PR changes the if (params->kv_overrides) logic in llama-quant.cpp to write UINT32 values for any INT overrides, so that llama-quantize --override-kv qwen3moe.expert_used_count=int:16 Qwen3-30B-A3B-BF16.gguf Qwen3-30B-A3B-Q4_K_M.gguf Q4_K_M generates a working model.
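The change is small; a simplified sketch of what the override branch looks like after the fix (not the exact diff, and ctx_out stands in for the output GGUF context used in llama-quant.cpp):

```cpp
// Copy the KV overrides into the output GGUF metadata.
for (const auto & o : overrides) {
    if (o.tag == LLAMA_KV_OVERRIDE_TYPE_FLOAT) {
        gguf_set_val_f32(ctx_out, o.key, o.val_f64);
    } else if (o.tag == LLAMA_KV_OVERRIDE_TYPE_INT) {
        // Previously written with gguf_set_val_i32(), which stamps the key
        // as i32; writing it as u32 keeps loaders that expect u32 happy.
        gguf_set_val_u32(ctx_out, o.key, (uint32_t) o.val_i64);
    } else if (o.tag == LLAMA_KV_OVERRIDE_TYPE_BOOL) {
        gguf_set_val_bool(ctx_out, o.key, o.val_bool);
    } else if (o.tag == LLAMA_KV_OVERRIDE_TYPE_STR) {
        gguf_set_val_str(ctx_out, o.key, o.val_str);
    }
}
```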

More context here

@CISC merged commit 30e5b01 into ggml-org:master on Jun 15, 2025
47 checks passed
@EAddario deleted the quant_i32 branch June 15, 2025 19:42