llama : use F32 precision in Qwen2 attention and no FA #8412

ggerganov · 2024-07-10T14:34:11Z

There have been a few reports of Qwen2 models generating "GGGG" when FA (flash attention) is disabled. This should fix it.
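The description above does not spell out the root cause. One common reason for forcing F32 in the attention matmul is that an F16 accumulator overflows past its largest finite value (65504) and turns into `inf`, which softmax then turns into `nan`. The following is a minimal illustrative sketch of that failure mode in plain Python, not the actual change in this PR: the vector length and magnitudes are made up, and `to_f16`, `to_f32`, `dot`, and `softmax` are hypothetical helper names. (ggml exposes `ggml_mul_mat_set_prec` with `GGML_PREC_F32` for requesting higher matmul precision; whether this PR uses exactly that call is not shown in the page text.)

```python
import math
import struct


def to_f16(x: float) -> float:
    """Round x to IEEE 754 half precision; return +/-inf on overflow."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the F16 range; model that as inf.
        return math.copysign(math.inf, x)


def to_f32(x: float) -> float:
    """Round x to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(q, k, rnd):
    """Dot product where every product and partial sum is rounded by rnd."""
    acc = 0.0
    for a, b in zip(q, k):
        acc = rnd(acc + rnd(a * b))
    return acc


def softmax(xs):
    """Numerically 'safe' softmax; still breaks once an input is inf."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


# Hypothetical query/key rows with large but finite activations.
q = [32.0] * 128
k = [32.0] * 128

score_f16 = dot(q, k, to_f16)  # partial sums exceed 65504 and become inf
score_f32 = dot(q, k, to_f32)  # stays finite

print("F16 accumulation:", score_f16)
print("F32 accumulation:", score_f32)
print("softmax with an inf score:", softmax([score_f16, 1.0]))
```

Under these assumptions the F16 score overflows while the F32 score stays finite, and a single `inf` score is enough to poison every attention weight in that row, which is consistent with a model emitting one repeated token.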

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

llama : use F32 precision in Qwen2 attention and no FA

7c9e9a2

JohannesGaessler approved these changes Jul 10, 2024

ggerganov merged commit 7a221b6 into master Jul 11, 2024
54 checks passed

ggerganov deleted the gg/qwen2-f32-prec branch July 11, 2024 07:21

ggerganov mentioned this pull request Jul 11, 2024

chore: enable fast attention for Qwen2-1.5B-Instruct model TabbyML/tabby#2592

Merged

tobias-varden mentioned this pull request Jul 11, 2024

glm-4-9b-chat responding not correctly ollama/ollama#5563

Closed

loveyume520 mentioned this pull request Jul 12, 2024

glm4 fails immediately with an error ollama/ollama#5647

Closed

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 13, 2024

llama : use F32 precision in Qwen2 attention and no FA (ggml-org#8412)

2186990

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 13, 2024

llama : use F32 precision in Qwen2 attention and no FA (ggml-org#8412)

2ed5fd5

ggerganov mentioned this pull request Jul 16, 2024

Support glm3 and glm4. #8031

Merged

4 tasks

tin2tin mentioned this pull request Aug 19, 2024

[Request] Add LongWriter model(s) nomic-ai/gpt4all#2883

Open

piDack mentioned this pull request Aug 22, 2024

llama:use F32 precision in GLM4 attention and no FA #9130

Merged

4 tasks
