llama : Add IBM granite template #10013
Conversation
Co-authored-by: Xuan Son Nguyen <[email protected]>
@ngxson thank you so much for your help, I appreciate your time and effort! I have applied your review 👍 I have a quick question: when I run it, I'm not sure what the output should look like in the context of this script, considering the other prompts have things like additional spacing as well. Edit: sorry, I was looking at the deepseek template. I guess mine failed:
You need to add an example of an unformatted chat template in tests/test-chat-template.cpp
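For illustration, here is a trimmed sketch of the two strings such a test entry pairs up: the unformatted Jinja template and the output expected for a sample conversation. The full Granite template on the model card is longer (it also covers tools and documents), so treat both strings as illustrative rather than the exact test data:

```cpp
// Sketch of the two strings a test-chat-template.cpp entry pairs up:
// the raw Jinja template and the expected formatted output.
// Both are trimmed illustrations, not the exact strings in the test.
#include <iostream>
#include <string>

int main() {
    // Unformatted template (Jinja), plain-chat path only.
    const std::string granite_tmpl =
        "{%- for message in messages %}"
        "{{- '<|start_of_role|>' + message['role'] + '<|end_of_role|>'"
        " + message['content'] + '<|end_of_text|>\\n' }}"
        "{%- endfor %}"
        "{%- if add_generation_prompt %}"
        "{{- '<|start_of_role|>assistant<|end_of_role|>\\n' }}"
        "{%- endif %}";

    // Expected output for a system + user exchange with a generation prompt.
    const std::string expected =
        "<|start_of_role|>system<|end_of_role|>You are a helpful assistant<|end_of_text|>\n"
        "<|start_of_role|>user<|end_of_role|>Hello<|end_of_text|>\n"
        "<|start_of_role|>assistant<|end_of_role|>\n";

    std::cout << granite_tmpl << "\n---\n" << expected;
}
```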
Small change to \n
Hi @ngxson, thank you again for your guidance! Here is the output of

And here is the output of

And a prompt example:
Small changes before I can merge
Co-authored-by: Xuan Son Nguyen <[email protected]>
@ngxson Thank you again and sorry about that!
Thanks, merging once the CI passes
Branch: GraniteThreeSupport This is a port of the work done in llama.cpp with a slight tweak for the tool call response: ggml-org/llama.cpp#10013 Signed-off-by: Gabe Goodhart <[email protected]>
* Add granite template to llama.cpp
* Add granite template to test-chat-template.cpp
* Update src/llama.cpp
  Co-authored-by: Xuan Son Nguyen <[email protected]>
* Update tests/test-chat-template.cpp
  Co-authored-by: Xuan Son Nguyen <[email protected]>
* Added proper template and expected output
* Small change to \n
* Add code space &
  Co-authored-by: Xuan Son Nguyen <[email protected]>
* Fix spacing
* Apply suggestions from code review
* Update src/llama.cpp

Co-authored-by: Xuan Son Nguyen <[email protected]>
Hi @ggerganov and @ngxson,
I'd like to contribute the IBM granite template to llama.cpp.
I've made my best effort, but I'm new to this, so could you please review it and check that it's alright?
I've followed the wiki instructions, but feel free to make changes or give feedback on how to improve.
The model is here: https://huggingface.co/ibm-granite/granite-3.0-8b-instruct
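For context, the Granite template wraps each turn in role markers (`<|start_of_role|>`, `<|end_of_role|>`) and closes it with `<|end_of_text|>`. Below is a minimal standalone C++ sketch of that formatting, modeled on the pattern llama.cpp's chat-template dispatcher uses for other templates; the struct and function names here are illustrative, not the merged code:

```cpp
// Standalone sketch of Granite-style chat formatting. The special tokens
// come from the granite-3.0-8b-instruct model card; the surrounding code
// (chat_message, format_granite) is illustrative, not llama.cpp's API.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct chat_message {
    std::string role;
    std::string content;
};

// Format a conversation the way the Granite template lays it out; add_ass
// mirrors the template's add_generation_prompt flag, appending the assistant
// prefix so the model continues from there.
static std::string format_granite(const std::vector<chat_message> & chat, bool add_ass) {
    std::ostringstream ss;
    for (const auto & msg : chat) {
        ss << "<|start_of_role|>" << msg.role << "<|end_of_role|>"
           << msg.content << "<|end_of_text|>\n";
    }
    if (add_ass) {
        ss << "<|start_of_role|>assistant<|end_of_role|>\n";
    }
    return ss.str();
}

int main() {
    std::vector<chat_message> chat = {
        { "system", "You are a helpful assistant." },
        { "user",   "Hello!" },
    };
    std::cout << format_granite(chat, /*add_ass=*/true);
}
```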
Here is the current output of llama-cli: