llama : Add IBM granite template #10013
Conversation
Co-authored-by: Xuan Son Nguyen <[email protected]>
@ngxson thank you so much for your help, I appreciate your time and effort! I have applied your review 👍 I have a quick question: when I run it, I'm not sure what the output should look like in the context of this script, considering the other prompts have things like additional spacing as well. Edit: sorry, I was looking at the deepseek template. I guess mine failed:
You need to add an example of an unformatted chat template in tests/test-chat-template.cpp
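For illustration, here is a trimmed sketch of the two strings such a test entry pairs up: the unformatted Jinja template and the output expected for a sample conversation. The full Granite template on the model card is longer (it also covers tools and documents), so treat both strings as illustrative rather than the exact test data:

```cpp
// Sketch of the two strings a test-chat-template.cpp entry pairs up:
// the raw Jinja template and the expected formatted output.
// Both are trimmed illustrations, not the exact strings in the test.
#include <iostream>
#include <string>

int main() {
    // Unformatted template (Jinja), plain-chat path only.
    const std::string granite_tmpl =
        "{%- for message in messages %}"
        "{{- '<|start_of_role|>' + message['role'] + '<|end_of_role|>'"
        " + message['content'] + '<|end_of_text|>\\n' }}"
        "{%- endfor %}"
        "{%- if add_generation_prompt %}"
        "{{- '<|start_of_role|>assistant<|end_of_role|>\\n' }}"
        "{%- endif %}";

    // Expected output for a system + user exchange with a generation prompt.
    const std::string expected =
        "<|start_of_role|>system<|end_of_role|>You are a helpful assistant<|end_of_text|>\n"
        "<|start_of_role|>user<|end_of_role|>Hello<|end_of_text|>\n"
        "<|start_of_role|>assistant<|end_of_role|>\n";

    std::cout << granite_tmpl << "\n---\n" << expected;
}
```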
Small change to \n
Hi @ngxson, thank you again for your guidance! Here is the output of

And here is the output of

And a prompt example:
Small changes before I can merge
Co-authored-by: Xuan Son Nguyen <[email protected]>
@ngxson Thank you again and sorry about that!
Thanks, merging once the CI passes
Branch: GraniteThreeSupport This is a port of the work done in llama.cpp with a slight tweak for the tool call response: ggml-org/llama.cpp#10013 Signed-off-by: Gabe Goodhart <[email protected]>
* Add granite template to llama.cpp
* Add granite template to test-chat-template.cpp
* Update src/llama.cpp
  Co-authored-by: Xuan Son Nguyen <[email protected]>
* Update tests/test-chat-template.cpp
  Co-authored-by: Xuan Son Nguyen <[email protected]>
* Added proper template and expected output
* Small change to \n
* Add code space &
  Co-authored-by: Xuan Son Nguyen <[email protected]>
* Fix spacing
* Apply suggestions from code review
* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>
Hi @ggerganov and @ngxson,
I'd like to contribute the IBM granite template to llama.cpp.
I've made my best effort, but I'm new to this, so could you please review it and check that it's alright?
I've followed the wiki instructions, but feel free to make changes or give feedback on how to improve.
The model is here: https://huggingface.co/ibm-granite/granite-3.0-8b-instruct
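For context, the Granite template wraps each turn in role markers (`<|start_of_role|>`, `<|end_of_role|>`) and closes it with `<|end_of_text|>`. Below is a minimal standalone C++ sketch of that formatting, modeled on the pattern llama.cpp's chat-template dispatcher uses for other templates; the struct and function names here are illustrative, not the merged code:

```cpp
// Standalone sketch of Granite-style chat formatting. The special tokens
// come from the granite-3.0-8b-instruct model card; the surrounding code
// (chat_message, format_granite) is illustrative, not llama.cpp's API.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct chat_message {
    std::string role;
    std::string content;
};

// Format a conversation the way the Granite template lays it out; add_ass
// mirrors the template's add_generation_prompt flag, appending the assistant
// prefix so the model continues from there.
static std::string format_granite(const std::vector<chat_message> & chat, bool add_ass) {
    std::ostringstream ss;
    for (const auto & msg : chat) {
        ss << "<|start_of_role|>" << msg.role << "<|end_of_role|>"
           << msg.content << "<|end_of_text|>\n";
    }
    if (add_ass) {
        ss << "<|start_of_role|>assistant<|end_of_role|>\n";
    }
    return ss.str();
}

int main() {
    std::vector<chat_message> chat = {
        { "system", "You are a helpful assistant." },
        { "user",   "Hello!" },
    };
    std::cout << format_granite(chat, /*add_ass=*/true);
}
```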
Here is the current output of llama-cli: