A simple FastAPI application that provides a REST API for interacting with Cerebras LLM using LangChain.
- Single `/chat` endpoint for sending messages to the Cerebras LLM
- Support for custom models and API keys
- Health check endpoints
- Async support for better performance
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set your Cerebras API key (optional if provided in requests):

  ```bash
  export CEREBRAS_API_KEY="your-api-key-here"
  ```

- Run the server:

  ```bash
  python main.py
  ```

The server will start on http://localhost:8000.
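Once the server is up, you can call it from Python as well as curl. The helper below is a hypothetical convenience (not part of the project) that builds the `POST /chat` request using only the standard library:

```python
import json
import urllib.request

def build_chat_request(message: str,
                       model: str = "llama-3.1-8b-instruct",
                       api_key: str = None,
                       base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /chat request; api_key is omitted when unset."""
    body = {"message": message, "model": model}
    if api_key:
        body["api_key"] = api_key
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the server running:
#   with urllib.request.urlopen(build_chat_request("Hello!")) as resp:
#       print(json.load(resp)["response"])
```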
`POST /chat` — Send a message to the Cerebras LLM.
Request Body:

```json
{
  "message": "Hello, how are you?",
  "model": "llama-3.1-8b-instruct",
  "api_key": "your-api-key"  // optional if set as environment variable
}
```

Response:

```json
{
  "response": "I'm doing well, thank you for asking!",
  "provider_info": {
    "provider": "cerebras",
    "model": "llama-3.1-8b-instruct",
    "api_key_available": true
  }
}
```

Health check endpoint.
Detailed health check endpoint.
Once the server is running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
```bash
curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is the capital of France?",
    "model": "llama-3.1-8b-instruct"
  }'
```
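The server replies with a JSON body shaped like the response documented above. A small example of parsing it in Python (the sample values here are illustrative, not real model output):

```python
import json

# Sample body shaped like the documented /chat response (values illustrative).
sample = """
{
  "response": "The capital of France is Paris.",
  "provider_info": {
    "provider": "cerebras",
    "model": "llama-3.1-8b-instruct",
    "api_key_available": true
  }
}
"""

payload = json.loads(sample)
print(payload["response"])                # → The capital of France is Paris.
print(payload["provider_info"]["model"])  # → llama-3.1-8b-instruct
```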