Name and Version
```
$ llama-cli --version
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (AMD Ryzen AI 9 HX 370 w/ Radeon 890M)
load_backend: failed to find ggml_backend_init in /home/shouyud/llama.cpp/build/bin/libggml-cpu.so
version: 5731 (bb16041)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
```
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-cli
Command line
```
llama-mtmd-cli -m ./gemma-4bit-unsloth/gemma-3-4b-it-Q4_K_M.gguf --mmproj ./gemma-4bit-unsloth/mmproj-BF16.gguf
```
Problem description & steps to reproduce
Hello all,
While studying how Gemma3 is implemented in llama.cpp (I am still new to llama.cpp), I noticed something bizarre in the implementation, and it took me some time to track down.
The issue is that the transformers implementation of Gemma3 hardcodes the rope type for local_emb_rope (the RoPE used for the sliding-window layers), so those layers get their own rope parameters instead of the global ones.
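For context, this is roughly what transformers does when building the two rotary embeddings (a sketch from my reading of modeling_gemma3.py; exact lines and defaults may differ between versions): the local rotary embedding is built from a copy of the config with rope_theta replaced by rope_local_base_freq and rope_scaling forced to the default rope type, so any scaling configured for the global layers is dropped.

```python
# Sketch of how transformers builds the two rotary embeddings for Gemma3
# (paraphrased from modeling_gemma3.py; names/defaults may differ by version).
import copy

from transformers import Gemma3TextConfig
from transformers.models.gemma3.modeling_gemma3 import Gemma3RotaryEmbedding

config = Gemma3TextConfig()  # rope_theta and rope_local_base_freq differ by default

# Global (non-sliding) layers: use rope_theta and any configured rope_scaling.
rotary_emb = Gemma3RotaryEmbedding(config=config)

# Local (sliding-window) layers: rope_theta is overridden with
# rope_local_base_freq and the rope type is hardcoded to "default",
# i.e. no scaling, regardless of what the global layers use.
local_config = copy.deepcopy(config)
local_config.rope_theta = config.rope_local_base_freq
local_config.rope_scaling = {"rope_type": "default"}
rotary_emb_local = Gemma3RotaryEmbedding(config=local_config)
```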
However, the current llama.cpp implementation assumes the sliding and non-sliding layers use the same freq_scale and freq_base. This differs from the transformers implementation and confused me when I was inspecting the values reaching the RoPE code during runtime debugging.
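To make the discrepancy concrete, here is a small numeric sketch (my own illustration, not llama.cpp code). I am assuming the gemma-3-4b-it parameters as I understand them: base 1e6 with linear scaling factor 8 for the global layers, base 1e4 with no scaling for the local layers, and a head dimension of 256:

```python
import numpy as np

HEAD_DIM = 256  # assumed head size for gemma-3-4b-it, for illustration only

def rope_angles(pos: int, freq_base: float, freq_scale: float) -> np.ndarray:
    # Standard RoPE: theta_i = pos * freq_scale / freq_base^(2i/d).
    inv_freq = 1.0 / freq_base ** (np.arange(0, HEAD_DIM, 2) / HEAD_DIM)
    return pos * freq_scale * inv_freq

# What transformers uses for the two layer kinds:
local_angles  = rope_angles(100, freq_base=1e4, freq_scale=1.0)      # sliding layers
global_angles = rope_angles(100, freq_base=1e6, freq_scale=1.0 / 8)  # full-attention layers

# If the same (global) freq_base/freq_scale is applied to every layer,
# the sliding layers rotate by global_angles instead of local_angles:
print(local_angles[:2])   # ~[100.0, 93.06, ...]
print(global_angles[:2])  # ~[ 12.5, 11.22, ...]
```

Running this shows the two parameter sets produce clearly different rotation angles at the same position, which matches the confusing values I saw in the debugger.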
First Bad Commit
No response