

llama : Add IBM granite template #10013


Merged
11 commits merged into ggml-org:master on Oct 28, 2024

Conversation

arch-btw
Contributor

Hi @ggerganov and @ngxson,

I'd like to contribute the IBM granite template to llama.cpp.

I've made my best effort, but I'm new to this, so could you please review it? Maybe you can check whether it's alright.

I've followed the wiki instructions but feel free to make changes or provide feedback on how to improve.

The model is here: https://huggingface.co/ibm-granite/granite-3.0-8b-instruct

Here is the current output of llama-cli:

main: chat template example:
<|start_of_role|>system<|end_of_role|>
You are a helpful assistant<|end_of_text|>
<|start_of_role|>user<|end_of_role|>
Hello<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>
Hi there<|end_of_text|>
<|start_of_role|>user<|end_of_role|>
How are you?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>

[screenshot: granite chat template output]
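The rendering above can be sketched as a small helper. This is a hypothetical illustration of the format only (the struct and function names are made up, and it is not the actual llama.cpp implementation):

```cpp
#include <string>
#include <vector>

struct chat_msg { std::string role; std::string content; };

// Hypothetical sketch of the Granite chat format shown above:
// each turn renders as <|start_of_role|>role<|end_of_role|>\ncontent<|end_of_text|>\n,
// and an open assistant turn is appended when the model should continue generating.
std::string format_granite(const std::vector<chat_msg> & chat, bool add_generation_prompt) {
    std::string out;
    for (const auto & m : chat) {
        out += "<|start_of_role|>" + m.role + "<|end_of_role|>\n" + m.content + "<|end_of_text|>\n";
    }
    if (add_generation_prompt) {
        out += "<|start_of_role|>assistant<|end_of_role|>\n";
    }
    return out;
}
```

Feeding it the conversation from the example reproduces the role/content line structure printed by llama-cli.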

@github-actions github-actions bot added the testing Everything test related label Oct 23, 2024
@arch-btw arch-btw changed the title Add IBM granite template llama : Add IBM granite template Oct 26, 2024
@arch-btw
Contributor Author

arch-btw commented Oct 26, 2024

@ngxson thank you so much for your help, I appreciate your time and effort!

I have applied your review 👍

I have a quick question: when I run test-chat-template, is it normal that it looks like this (screenshot below)?
The part that confuses me is that <end_of_sentence> is not escaped, but the other tags are.
And then user/assistant appear on the same line, twice.

I'm not sure what it should look like in the context of this script, considering the other prompts have things like additional spacing as well.

Edit: sorry, I'm looking at the deepseek template. I guess mine failed:

terminate called after throwing an instance of 'std::length_error'
what(): vector::_M_default_append
fish: Job 1, './tests/test-chat-template' terminated by signal SIGABRT (Abort)

[screenshot: test-chat-template output]
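For context, this failure mode is consistent with a template-apply call returning a negative error code (e.g. template not recognized) that is then passed to vector::resize, where it converts to an enormous size_t. A minimal sketch of that mechanism (hypothetical, not code from the test file):

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical illustration: a negative int error code passed to resize()
// converts to a huge size_t, which libstdc++ rejects by throwing
// std::length_error ("vector::_M_default_append"), matching the abort above.
bool negative_resize_throws_length_error() {
    int32_t res = -1;          // imagined error return from a template-apply call
    std::vector<char> buf;
    try {
        buf.resize(res);       // -1 converts to SIZE_MAX, far above max_size()
    } catch (const std::length_error &) {
        return true;
    }
    return false;
}
```

An uncaught std::length_error terminates the process via std::terminate, which is what SIGABRT in the fish shell output above looks like.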

@ngxson
Collaborator

ngxson commented Oct 26, 2024

You need to add an example of an unformatted chat template in test-chat-template. Please look carefully at the file.

@arch-btw
Contributor Author

Hi @ngxson

Thank you again for your guidance!
I finally managed to get it to work! 😅

Here is the output of make tests/test-chat-template && ./tests/test-chat-template:


[screenshot: test-chat-template output]


And here is the output of ./llama-cli -m granite-3.0-8B-instruct-F32-Q5_K_M.gguf -t 4 -p 'You are granite, an AI model.' --conversation --color --chat-template granite:

[screenshot: llama-cli conversation output]

And a prompt example:

[screenshot: prompt example]

@arch-btw arch-btw requested a review from ngxson October 27, 2024 01:22
Collaborator

@ngxson ngxson left a comment


Small changes before I can merge

arch-btw and others added 2 commits October 28, 2024 04:20
@arch-btw
Contributor Author

@ngxson Thank you again and sorry about that!

Collaborator

@ngxson ngxson left a comment


Thanks, merging once the CI passes

@ngxson ngxson merged commit 61715d5 into ggml-org:master Oct 28, 2024
53 checks passed
gabe-l-hart added a commit to gabe-l-hart/llamafile that referenced this pull request Nov 4, 2024
Branch: GraniteThreeSupport

This is a port of the work done in llama.cpp with a slight tweak for the
tool call response:
ggml-org/llama.cpp#10013

Signed-off-by: Gabe Goodhart <[email protected]>
gabe-l-hart added a commit to gabe-l-hart/llamafile that referenced this pull request Nov 5, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* Add granite template to llama.cpp

* Add granite template to test-chat-template.cpp

* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>

* Update tests/test-chat-template.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>

* Added proper template and expected output

* Small change to \n

* Add code space &

Co-authored-by: Xuan Son Nguyen <[email protected]>

* Fix spacing

* Apply suggestions from code review

* Update src/llama.cpp

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
gabe-l-hart added a commit to gabe-l-hart/llamafile that referenced this pull request Dec 10, 2024
gabe-l-hart added a commit to gabe-l-hart/llamafile that referenced this pull request Mar 14, 2025