Add initial Hugging Face support #359
Thanks for working on this! A few style comments below.
  resp <- chat$chat("What is 1 + 1?", echo = FALSE)
  expect_match(resp, "2")
  expect_equal(chat$last_turn()@tokens > 0, c(TRUE, TRUE))
})
Do you want to add any other tests to verify that (e.g.) tool calling works?
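For reference, such a test might look roughly like the sketch below. This is only a sketch: it assumes a `register_tool()` method and a `tool()` constructor along the lines of the package's existing API, and the exact signatures (and the `get_current_date` tool itself) are illustrative, not taken from this PR.

```r
test_that("can use tools", {
  chat <- chat_hf(model = "meta-llama/Llama-3.1-8B-Instruct")

  # Hypothetical tool: the model should call this rather than guess.
  # Check the package's current tool() signature before copying this.
  chat$register_tool(tool(
    function() "2024-01-01",
    "Gets the current date"
  ))

  resp <- chat$chat("What is the current date?", echo = FALSE)
  expect_match(resp, "2024-01-01")
})
```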
Thanks @hadley for the suggestions. I'm going to put this into draft mode, as I've found some issues with tool calling that I haven't been able to resolve yet. Do you have any suggestions for how best to debug this package? Many of the errors I'm getting are buried several layers deep in a traceback and aren't very helpful.
@s-spavound debugging tool calling is hard 😞 If you have reprexes, I'm happy to take a look and help out.
I've generally better aligned
Looks like tool calling is currently broken: huggingface/text-generation-inference#2986
This PR adds chat_hf(), a light wrapper around the OpenAI provider that supports the Hugging Face serverless inference API.
It essentially just modifies the base URL and the authentication step to be compatible with Hugging Face's requirements. I based it on how chat_github() is constructed.
As Hugging Face supports multiple ways of hosting models, I expect this will need further work to cover all of them. For example, it's a little tricky that certain models don't support system messages (even though they support the chat completions interface). I don't know of any way (yet) to detect which features a given model supports, so it's currently up to the user.
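In spirit, the wrapper is something like the following sketch. This is a simplified illustration, not the PR's actual implementation: the environment variable name and default model are assumptions, and the real code delegates through the package's OpenAI provider machinery.

```r
# Sketch only: Hugging Face serverless routes each model under its own
# /v1 path, so the base URL must embed the model id before delegating
# to the OpenAI-compatible chat constructor.
chat_hf <- function(model = "meta-llama/Llama-3.1-8B-Instruct",
                    api_key = Sys.getenv("HUGGINGFACE_API_KEY"),
                    ...) {
  base_url <- paste0(
    "https://api-inference.huggingface.co/models/", model, "/v1"
  )
  chat_openai(base_url = base_url, api_key = api_key, model = model, ...)
}
```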
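Since there is no capability endpoint to query, one option left to the user is to probe a model empirically. A hedged sketch (the function name is illustrative, and this assumes the chat constructor accepts a `system_prompt` argument and that an unsupported system message surfaces as an R error):

```r
# Sketch: detect whether a model accepts system messages by sending one
# and catching the resulting server error. Illustrative only.
supports_system_prompt <- function(model) {
  chat <- chat_hf(model = model, system_prompt = "You are terse.")
  tryCatch({
    chat$chat("Say OK.", echo = FALSE)
    TRUE
  }, error = function(e) FALSE)
}
```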
An alternative I thought about was documenting that Hugging Face serverless can be used via chat_openai() as follows:
chat <- chat_openai(
  base_url = "https://api-inference.huggingface.co/models/meta-llama/Llama-3.1-8B-Instruct/v1",
  api_key = Sys.getenv("HF_API_KEY"),
  model = "meta-llama/Llama-3.1-8B-Instruct"
)
Which does work.
I also fixed a tiny typo in one piece of documentation.
Please let me know if you were thinking about doing this a different way or want me to change anything else. Thanks!