Misc. bug: Inconsistent Gemma3 implementation in rope factor #14367

Open
@joeldushouyu


Name and Version

$llama-cli --version
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (AMD Ryzen AI 9 HX 370 w/ Radeon 890M)
load_backend: failed to find ggml_backend_init in /home/shouyud/llama.cpp/build/bin/libggml-cpu.so
version: 5731 (bb16041)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

llama-mtmd-cli -m ./gemma-4bit-unsloth/gemma-3-4b-it-Q4_K_M.gguf --mmproj ./gemma-4bit-unsloth/mmproj-BF16.gguf

Problem description & steps to reproduce

Hello all,
While studying how Gemma3 is implemented in llama.cpp (I am still new to llama.cpp), I noticed something bizarre in the implementation, and it took me some time to track it down.

The issue is that the transformers library's Gemma3 implementation hardcodes the rope type for local_emb_rope (the RoPE used for sliding-window layers).

However, the current llama.cpp implementation assumes that sliding-window and non-sliding-window layers use the same freq_scale and freq_base. This differs from the transformers implementation, and it confused me when I inspected the values in the rope code during runtime debugging.
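
To make the difference concrete, here is a minimal Python sketch of what per-layer RoPE parameters would look like. It is not code from llama.cpp or transformers. The helper names (rope_inv_freq, rope_params_for_layer) are hypothetical, and the numeric values (a base of 1,000,000 with linear scaling factor 8 for global layers, a base of 10,000 with no scaling for sliding layers) are my assumptions for illustration and should be checked against the model's config.

```python
def rope_inv_freq(head_dim, freq_base, freq_scale=1.0):
    """Return the RoPE inverse frequencies for one attention head.

    freq_scale multiplies every frequency; with linear scaling by a
    factor f, freq_scale would be 1 / f (assumption for this sketch).
    """
    return [freq_scale / (freq_base ** (2 * i / head_dim))
            for i in range(head_dim // 2)]


# Assumed, illustrative parameters; verify against the real model config.
GLOBAL_ROPE = {"freq_base": 1_000_000.0, "freq_scale": 1.0 / 8.0}
LOCAL_ROPE = {"freq_base": 10_000.0, "freq_scale": 1.0}


def rope_params_for_layer(is_sliding):
    """Pick the RoPE parameters by layer kind (hypothetical helper)."""
    return LOCAL_ROPE if is_sliding else GLOBAL_ROPE


head_dim = 8  # tiny value so the lists are easy to inspect
global_freqs = rope_inv_freq(head_dim, **rope_params_for_layer(False))
local_freqs = rope_inv_freq(head_dim, **rope_params_for_layer(True))

# The two layer kinds end up with different rotation frequencies.
print("global:", global_freqs)
print("local: ", local_freqs)
```

If both layer kinds were given the same freq_base and freq_scale, the two lists would be identical, which is the behaviour described above.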

First Bad Commit

No response

Relevant log output
