llm-proxy (Go)

A lightweight HTTP proxy for LLMs. It exposes a single /v1/chat/completions endpoint and routes to OpenAI or Ollama based on the requested model. Includes API key auth and simple rate limiting.


Quick Start (local)

# 1) Clone (replace {user} after you fork)
git clone https://github.com/{user}/llm-proxy.git
cd llm-proxy

# 2) Make your module path yours (replace <your-gh-user>)
# macOS:
sed -i.bak 's#github.com/{user}/llm-proxy#github.com/<your-gh-user>/llm-proxy#g' \
  go.mod $(git ls-files "*.go")
# Linux:
# sed -i 's#github.com/{user}/llm-proxy#github.com/<your-gh-user>/llm-proxy#g' \
#   go.mod $(git ls-files "*.go")

# 3) Deps
go mod tidy

# 4) Run (Ollama example)
PORT=8081 API_KEYS=demo_123 OLLAMA_URL=http://localhost:11434 go run .

Health:

curl -s localhost:8081/healthz

Chat (Ollama; ensure Ollama is running and a model is pulled, e.g., ollama pull llama3.1):

curl -s -X POST localhost:8081/v1/chat/completions \
  -H 'content-type: application/json' \
  -H 'X-API-Key: demo_123' \
  -d '{
    "model":"llama3.1",
    "messages":[{"role":"user","content":"Say hi in 3 words"}],
    "temperature":0.2
  }'

Chat (OpenAI; requires OPENAI_API_KEY):

OPENAI_API_KEY=sk-... PORT=8081 API_KEYS=demo_123 go run .

curl -s -X POST localhost:8081/v1/chat/completions \
  -H 'content-type: application/json' \
  -H 'X-API-Key: demo_123' \
  -d '{
    "model":"gpt-4o-mini",
    "messages":[{"role":"user","content":"Give me one fun fact"}]
  }'

Run in Docker

To do: Docker support is not documented yet; instructions will be added later.


Environment variables

PORT=8081
API_KEYS=demo_123,admin_456     # comma-separated; required for /v1/chat/completions
OPENAI_API_KEY=                 # set to use OpenAI
OLLAMA_URL=http://localhost:11434
RATE_LIMIT_TOKENS_PER_MIN=60
RATE_LIMIT_BURST=60
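
API_KEYS is a comma-separated allow-list enforced on /v1/chat/completions. A minimal sketch of what that check can look like as net/http middleware; the names parseKeys and requireKey are illustrative, not necessarily the repo's own:

package proxy

import (
	"net/http"
	"os"
	"strings"
)

// parseKeys splits API_KEYS into a set for O(1) lookup.
func parseKeys() map[string]bool {
	keys := map[string]bool{}
	for _, k := range strings.Split(os.Getenv("API_KEYS"), ",") {
		if k = strings.TrimSpace(k); k != "" {
			keys[k] = true
		}
	}
	return keys
}

// requireKey rejects requests whose X-API-Key header is not in the set.
func requireKey(keys map[string]bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !keys[r.Header.Get("X-API-Key")] {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}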

Create .env.example:

PORT=8081
API_KEYS=demo_123
OPENAI_API_KEY=
OLLAMA_URL=http://localhost:11434
RATE_LIMIT_TOKENS_PER_MIN=60
RATE_LIMIT_BURST=60
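
RATE_LIMIT_TOKENS_PER_MIN and RATE_LIMIT_BURST describe a token bucket: a steady refill rate plus a burst allowance. A sketch of how they could back a per-key limiter using golang.org/x/time/rate; this is an assumption about the mechanism, not a copy of the repo's code:

package proxy

import (
	"sync"
	"time"

	"golang.org/x/time/rate"
)

var (
	mu       sync.Mutex
	limiters = map[string]*rate.Limiter{} // one bucket per API key
)

// limiterFor returns the bucket for a key, creating it on first use.
// With perMin=60 and burst=60 this allows a steady 1 req/s with
// bursts of up to 60 requests.
func limiterFor(key string, perMin, burst int) *rate.Limiter {
	mu.Lock()
	defer mu.Unlock()
	l, ok := limiters[key]
	if !ok {
		l = rate.NewLimiter(rate.Every(time.Minute/time.Duration(perMin)), burst)
		limiters[key] = l
	}
	return l
}

A handler would call limiterFor(key, 60, 60).Allow() per request and answer 429 Too Many Requests when it returns false.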

API

Health:

  • GET /healthz → ok (no API key required)

Chat:

  • POST /v1/chat/completions
  • Header: X-API-Key: <your-key>
  • Body:
    {
      "model": "llama3.1",
      "messages": [
        { "role": "user", "content": "Explain RAG in one sentence" }
      ],
      "temperature": 0.2
    }
  • Response (OpenAI-like; see the struct sketch after this list):
    {
      "model": "llama3.1",
      "choices": [
        { "message": { "role": "assistant", "content": "..." } }
      ]
    }
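
The body and response above map directly onto a couple of small Go types; a sketch following the JSON shapes in this README (the repo's own type names may differ):

package proxy

// Message is one conversation turn.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatRequest mirrors the request body shown above.
type ChatRequest struct {
	Model       string    `json:"model"`
	Messages    []Message `json:"messages"`
	Temperature float64   `json:"temperature,omitempty"`
}

// ChatResponse mirrors the OpenAI-like response.
type ChatResponse struct {
	Model   string `json:"model"`
	Choices []struct {
		Message Message `json:"message"`
	} `json:"choices"`
}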

Routing rule: models that start with gpt- (and o* if you keep the sample) go to OpenAI; everything else goes to Ollama. See internal/providers/router.go.
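
A minimal sketch of that rule (the real logic lives in internal/providers/router.go and may be more involved):

package providers

import "strings"

// Route picks a provider from the model name: gpt-* (plus the sample's
// o* family) goes to OpenAI; everything else goes to Ollama.
func Route(model string) string {
	if strings.HasPrefix(model, "gpt-") || strings.HasPrefix(model, "o") {
		return "openai"
	}
	return "ollama"
}

Under this rule, "gpt-4o-mini" from the example above routes to OpenAI and "llama3.1" routes to Ollama.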


Run/Debug in GoLand

  • Run → Edit Configurations → + Go Build
    • Run kind: Package
    • Package path: .
    • Working dir: project root
    • Env:
      PORT=8081
      API_KEYS=demo_123
      OLLAMA_URL=http://localhost:11434
      # OPENAI_API_KEY=sk-...  (optional)
      RATE_LIMIT_TOKENS_PER_MIN=60
      RATE_LIMIT_BURST=60
      
  • Set a breakpoint in the /v1/chat/completions handler, click Debug, then send a curl request.

Optional GoLand HTTP client file (requests.http):

### Health
GET http://localhost:8081/healthz

### Chat (Ollama)
POST http://localhost:8081/v1/chat/completions
Content-Type: application/json
X-API-Key: demo_123

{
  "model": "llama3.1",
  "messages": [{"role":"user","content":"name 3 colors"}],
  "temperature": 0.2
}

Notes

  • This repo uses a placeholder import/module path github.com/{user}/llm-proxy. Replace {user} with your GitHub username after you clone/fork.

License

MIT
