llama : one-off chat template fix for Mistral-Small-2503 #13398

Merged: 3 commits merged into ggml-org:master on May 9, 2025

Conversation

@ngxson (Collaborator) commented on May 9, 2025

Mistral-Small-3.1-24B-Instruct-2503-GGUF does not ship a default chat template, yet there are already a lot of GGUF quants of it on the internet.

This is a one-off fix to prevent issues that users may report about the model's performance.
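For readers curious what a one-off fallback like this can look like, here is a minimal sketch (not the actual patch in this PR): when the GGUF metadata carries no chat template, the loader special-cases the model name and picks a Mistral template. The helper function and the way the model name is obtained are assumptions; only the `LLM_CHAT_TEMPLATE_*` naming mirrors the existing enum.

```cpp
// Minimal sketch of a name-based template fallback -- NOT the code merged in this PR.
// The helper and its call site are hypothetical; only the enum naming follows llama.cpp.
#include <string>

enum llm_chat_template {          // illustration-only subset
    LLM_CHAT_TEMPLATE_UNKNOWN,
    LLM_CHAT_TEMPLATE_MISTRAL_V7,
};

static llm_chat_template pick_fallback_template(const std::string & model_name,
                                                llm_chat_template from_gguf) {
    // only step in when the GGUF has no usable template of its own
    if (from_gguf != LLM_CHAT_TEMPLATE_UNKNOWN) {
        return from_gguf;
    }
    // one-off: existing quants of Mistral-Small-3.1-24B-Instruct-2503 shipped
    // without a template, so fall back to a Mistral V7-style template by name
    // (the thread below refines this to the v7-tekken spacing)
    if (model_name.find("Mistral-Small-3.1-24B-Instruct-2503") != std::string::npos) {
        return LLM_CHAT_TEMPLATE_MISTRAL_V7;
    }
    return LLM_CHAT_TEMPLATE_UNKNOWN;
}
```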

@ngxson requested a review from ggerganov on May 9, 2025 at 08:03
@eskeletor97 commented:

(image attached)
I don't know if that's correct either. V7 isn't the same as V7-tekken: regular V7 has spaces between sequences, while the tekken version doesn't. I don't know how much it impacts performance, but it is a difference.
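To make the difference concrete, here is roughly how the two variants render one system prompt plus one user turn, per Mistral's published formats; treat the exact strings as an assumption rather than something verified against mistral-common:

```cpp
// Illustration only: approximate renderings of one system prompt + one user turn.
// Regular V7 puts a space after each opening control tag; the tekken variant does not.
const char * v7 =
    "<s>[SYSTEM_PROMPT] You are helpful.[/SYSTEM_PROMPT]"
    "[INST] Hello[/INST]";
const char * v7_tekken =
    "<s>[SYSTEM_PROMPT]You are helpful.[/SYSTEM_PROMPT]"
    "[INST]Hello[/INST]";
```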

@ngxson (Collaborator, Author) commented on May 9, 2025

I don't think V7-tekken even exists. Did you mean V3-tekken?

    { "mistral-v1",        LLM_CHAT_TEMPLATE_MISTRAL_V1        },
    { "mistral-v3",        LLM_CHAT_TEMPLATE_MISTRAL_V3        },
    { "mistral-v3-tekken", LLM_CHAT_TEMPLATE_MISTRAL_V3_TEKKEN },
    { "mistral-v7",        LLM_CHAT_TEMPLATE_MISTRAL_V7        },

I think mistral-small should use the same template as pixtral, though I'm not sure if it's v7 or v3.

@eskeletor97

Oh, so llama.cpp doesn't support that template? I mean it's listed in the description of the model and in their mistral-common library (they insist on using that as the ground truth).
Yes, V7-tekken exists.

@ngxson (Collaborator, Author) commented on May 9, 2025

OK, thanks for the info. Yes, there is a difference between v7 and v7-tekken regarding the leading space; it should be correct with the latest commit.
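For context, the fix boils down to the spacing toggle sketched below (a hypothetical formatter, not the code from this PR): the same control tags are emitted either with or without a space before the message text.

```cpp
#include <string>
#include <vector>

struct chat_msg { std::string role, content; };

// Hypothetical formatter showing the only difference that matters here:
// whether a space separates each control tag from the message text.
static std::string format_mistral_v7(const std::vector<chat_msg> & msgs, bool tekken) {
    const std::string sep = tekken ? "" : " ";
    std::string out = "<s>";
    for (const auto & m : msgs) {
        if (m.role == "system") {
            out += "[SYSTEM_PROMPT]" + sep + m.content + "[/SYSTEM_PROMPT]";
        } else if (m.role == "user") {
            out += "[INST]" + sep + m.content + "[/INST]";
        } else { // assistant
            out += sep + m.content + "</s>";
        }
    }
    return out;
}
```

Under that assumption, calling `format_mistral_v7(msgs, /*tekken=*/true)` produces the tighter spacing the v7-tekken variant expects.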

@ngxson mentioned this pull request on May 9, 2025
@ngxson merged commit 3f96aef into ggml-org:master on May 9, 2025; 44 checks passed.
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request on May 9, 2025:
* origin/master: (39 commits)
server : vision support via libmtmd (ggml-org#12898)
sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (ggml-org#12858)
metal : optimize MoE for large batches (ggml-org#13388)
CUDA: FA support for Deepseek (Ampere or newer) (ggml-org#13306)
llama : do not crash if there is no CPU backend (ggml-org#13395)
CUDA: fix crash on large batch size for MoE models (ggml-org#13384)
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (ggml-org#13389)
llama-run: add support for downloading models from ModelScope (ggml-org#13370)
mtmd : fix batch_view for m-rope (ggml-org#13397)
llama : one-off chat template fix for Mistral-Small-2503 (ggml-org#13398)
rpc : add rpc_msg_set_tensor_hash_req (ggml-org#13353)
vulkan: Allow up to 4096 elements for mul_mat_id row_ids (ggml-org#13326)
server : (webui) rename has_multimodal --> modalities (ggml-org#13393)
ci : limit write permission to only the release step + fixes (ggml-org#13392)
mtmd : Expose helper_decode_image_chunk (ggml-org#13366)
server : (webui) fix a very small misalignment (ggml-org#13387)
server : (webui) revamp the input area, plus many small UI improvements (ggml-org#13365)
convert : support rope_scaling type and rope_type (ggml-org#13349)
mtmd : fix the calculation of n_tokens for smolvlm (ggml-org#13381)
context : allow cache-less context for embeddings (ggml-org#13108)
...