[Bug]: Corrupted token outputs (`???` / `<unused>`) on ROCm backend with `gfx1152` target (Ryzen AI 350) #5579


### Description
When running inference with `llama.cpp` on the native ROCm (HIP) 7.14 backend (installed from the nightly tarball at https://therock-nightly-tarball.s3.amazonaws.com) on a `gfx1152` APU, generation runs at full speed, but the model emits only corrupted tokens.


### Environment
* **OS:** CachyOS (Arch Linux-based)
* **Hardware:** AMD Ryzen 7 AI 350 with Radeon 860M (`gfx1152`)
* **ROCm Version:** 7.14.0a20260602 Nightly (built via `rocm-gfx1152-bin`)

### Reproduction Steps
1. Build and install ROCm 7.14 locally using the latest nightly tarball compiled for the `gfx1152` architecture.
2. Build `llama.cpp` from source with the `gfx1152` target flag.
3. Run inference with a Qwen or Gemma model on the ROCm backend.
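
The exact build command is not stated in this report. As a rough sketch of what step 2 might look like, the helper below assembles a CMake invocation for a HIP build. It assumes the `GGML_HIP` and `GPU_TARGETS` CMake options described in the llama.cpp build documentation (older checkouts used `AMDGPU_TARGETS` instead), and the function names `hip_configure_args` and `build_args` are hypothetical, introduced only for illustration.

```python
import shlex


def hip_configure_args(gpu_target, source_dir=".", build_dir="build"):
    """Return a cmake configure command for a HIP (ROCm) build.

    Assumption: the checkout accepts -DGGML_HIP=ON and -DGPU_TARGETS=...;
    these may not be the flags the reporter actually used.
    """
    return [
        "cmake", "-S", source_dir, "-B", build_dir,
        "-DGGML_HIP=ON",
        f"-DGPU_TARGETS={gpu_target}",
        "-DCMAKE_BUILD_TYPE=Release",
    ]


def build_args(build_dir="build", jobs=8):
    """Return the cmake command that compiles the configured tree."""
    return ["cmake", "--build", build_dir, "--config", "Release", "-j", str(jobs)]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in (hip_configure_args("gfx1152"), build_args()):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without ROCm installed; pass each list to `subprocess.run(cmd, check=True)` to perform the build.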

### Actual Behavior
1. Qwen models: The output is a uniform, endless stream of `?` or `???` characters.

2. Gemma models: Generation gets stuck in an endless loop emitting `<unused24>` (or similar unused/special tokens).
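
To make the two symptoms above easy to check when comparing backends or bisecting builds, here is a minimal sketch of a classifier for captured model output. The function name `classify_output`, the `<unusedN>` token pattern, and the 90% threshold are assumptions chosen for illustration, not part of `llama.cpp`.

```python
import re

# Assumption: Gemma-style placeholder tokens look like <unused0>, <unused24>, ...
UNUSED_TOKEN = re.compile(r"<unused\d+>")


def classify_output(text, min_chars=16, threshold=0.9):
    """Label generated text by the corruption pattern that dominates it."""
    compact = "".join(text.split())  # ignore whitespace between tokens
    if len(compact) < min_chars:
        return "too short to judge"
    # Share of the output made up of <unusedN> placeholder tokens.
    unused_chars = sum(len(m) for m in UNUSED_TOKEN.findall(compact))
    if unused_chars / len(compact) >= threshold:
        return "unused-token loop"
    # Share of the output made up of literal question marks.
    if compact.count("?") / len(compact) >= threshold:
        return "question-mark stream"
    return "no known pattern"


if __name__ == "__main__":
    print(classify_output("??? " * 20))
    print(classify_output("<unused24>" * 10))
    print(classify_output("The capital of France is Paris."))
```

A ratio is used instead of an exact match so that a short prefix of ordinary text before the corruption starts does not hide the pattern.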

### Expected Behavior
The model should generate coherent text output, matching the results obtained via the Vulkan or CPU backends.
