

Introduction

Debug your AI agents with complete visibility into every request. vLLora works out of the box with OpenAI-compatible endpoints, supports 300+ models with your own keys, and captures deep traces on latency, cost, and model output.

[Screenshot: vLLora Tracing Interface]

Installation

Install vLLora on Linux or macOS using Homebrew:

brew tap vllora/vllora 
brew install vllora

Launch vLLora:

vllora

For more installation options (Rust crate, build from source, etc.), see the Installation page.

Send Your First Request

After starting vLLora, visit http://localhost:9091 to configure your API keys through the UI. Once configured, point your application to http://localhost:9090 as the base URL.

vLLora works as a drop-in replacement for the OpenAI API, so you can use any OpenAI-compatible client or SDK. Every request will be captured and visualized in the UI with full tracing details.

curl -X POST http://localhost:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Hello, vLLora!"
      }
    ]
  }'
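
You can also go through an SDK instead of raw HTTP. Here is a minimal sketch using the official OpenAI Python SDK, assuming that pointing base_url at the proxy is all that's needed; the api_key value below is a placeholder, since provider keys are configured in the vLLora UI rather than passed by the client.

from openai import OpenAI

# Point the SDK at the local vLLora proxy instead of api.openai.com.
# Placeholder key: actual provider keys are set in the vLLora UI (assumption).
client = OpenAI(base_url="http://localhost:9090/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, vLLora!"}],
)
print(response.choices[0].message.content)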

Now check http://localhost:9091 to see your first trace with full request details, costs, and timing breakdowns.