Thanks to visit codestin.com
Credit goes to Github.com

Skip to content

ggml-org/LlamaBarn

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LlamaBarn

LlamaBarn is a tiny menu bar app for running local LLMs.


LlamaBarn


Install

Install with brew install --cask llamabarn or download from Releases.

How it works

LlamaBarn runs a local server at http://localhost:2276/v1.

  • Install models — from the built-in catalog
  • Connect any app — chat UIs, editors, CLI tools, scripts
  • Models load when requested — and unload when idle

Features

  • 100% local — Models run on your device; no data leaves your Mac
  • Small footprint12 MB native macOS app
  • Zero configuration — models are auto-configured with optimal settings for your Mac
  • Smart model catalog — shows what fits your Mac, with quantized fallbacks for what doesn't
  • Self-contained — all models and config stored in ~/.llamabarn
  • Built on llama.cpp — from the GGML org, developed alongside llama.cpp

Works with

LlamaBarn works with any OpenAI-compatible client.

  • Chat UIs — Chatbox, Open WebUI, BoltAI (instructions)
  • Editors — VS Code, Zed, Xcode (instructions)
  • Editor extensions — Cline, Continue
  • CLI tools — OpenCode (instructions), Claude Code (instructions)
  • Custom scripts — curl, AI SDK, etc.

You can also use the built-in WebUI at http://localhost:2276 while LlamaBarn is running.

API examples

# list installed models
curl http://localhost:2276/v1/models
# chat with Gemma 3 4B (assuming it's installed)
curl http://localhost:2276/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma-3-4b", "messages": [{"role": "user", "content": "Hello"}]}'

Replace gemma-3-4b with any model ID from http://localhost:2276/v1/models.

See complete API reference in llama-server docs.

Experimental settings

Expose to network — By default, the server is only accessible from your Mac (localhost). This option allows connections from other devices on your local network. Only enable this if you understand the security risks.

# bind to all interfaces (0.0.0.0)
defaults write app.llamabarn.LlamaBarn exposeToNetwork -bool YES

# or bind to a specific IP (e.g., for Tailscale)
defaults write app.llamabarn.LlamaBarn exposeToNetwork -string "100.x.x.x"

# disable (default)
defaults delete app.llamabarn.LlamaBarn exposeToNetwork

Roadmap

  • Support for adding models outside the built-in catalog
  • Support for loading multiple models at the same time
  • Support for multiple configurations per model (e.g., multiple context lengths)