A simpler self-hosted alternative to Open WebUI. Bring your own model — Anthropic, Google Gemini, or any OpenAI-compatible endpoint. One docker compose up and you're in.
Open WebUI is an impressive project, but every time I tried to actually live in it something got in the way. The browser tab would peg CPU and balloon past a gig on long replies — the streaming pipeline re-broadcasts the entire growing message body on every token, so a long chat is O(N²) in bytes (#23733, still open). Pasting any sizeable chunk of text would freeze the page for seconds (#12087, still open). The v0.9 release line shipped a run of migration regressions where you'd docker pull and then have to docker exec into the container and hand-edit alembic scripts before the app would boot.
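To make the quadratic cost concrete, here is a back-of-the-envelope sketch. It assumes a reply of N tokens averaging a fixed number of bytes each, and compares re-sending the whole message on every token against sending only the new token. The function names and the 4-byte average are illustrative assumptions, not measurements taken from Open WebUI.

```typescript
// Total bytes sent when every new token triggers a re-broadcast of the whole
// message so far: sum over k = 1..n of (k * bytesPerToken)
// = bytesPerToken * n * (n + 1) / 2, which grows as O(n^2).
function fullRebroadcastBytes(tokens: number, bytesPerToken: number): number {
  return (bytesPerToken * tokens * (tokens + 1)) / 2;
}

// Total bytes sent when each event carries only the newly generated token: O(n).
function deltaStreamBytes(tokens: number, bytesPerToken: number): number {
  return bytesPerToken * tokens;
}

// Assumed figures: a 4,000-token reply at 4 bytes per token.
const tokens = 4000;
const bytesPerToken = 4;
const full = fullRebroadcastBytes(tokens, bytesPerToken); // 32,008,000 bytes
const delta = deltaStreamBytes(tokens, bytesPerToken); // 16,000 bytes
console.log(`full: ${full} B, delta: ${delta} B, ratio: ${full / delta}`);
```

Under these assumptions a single long reply moves about 32 MB of payload when re-broadcast in full, against 16 KB when streamed as deltas.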
And underneath all that, the UI just feels heavy. Settings pages full of toggles for features I'd never use. Web search that wants its own API key. TTS that wants its own setup. A hundred surfaces in front of a single text box.
I wanted a chat app I could open and use. So I wrote one.
- One process, one file. A Next.js app and a SQLite file. No Postgres, no Redis, no Celery, no separate API service. Schema migrations are one Drizzle command on container boot.
- 600 MB Docker image vs Open WebUI's 1.7 GB (compressed, amd64, pulled from ghcr.io on 2026-05-20). About a third the size on disk, fewer layers.
- No plugin runtime, no pipelines, no functions framework. Tools are two AI SDK definitions in lib/tools.ts: web_search (SearXNG) and fetch_url (Defuddle → markdown). That's the whole extensibility surface.
- No RAG, no embeddings, no vector DB. Chat search is SQLite FTS5 + BM25, populated by triggers (lib/db/search.ts). Web search results go straight into context as JSON.
- One LLM client for everything. Anthropic, Google Gemini, OpenAI, Groq, OpenRouter, xAI, Mistral, DeepSeek, Together, Ollama, vLLM, llama.cpp: all through @ai-sdk/openai-compatible.
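For readers unfamiliar with trigger-populated FTS5 search, the idea can be sketched as follows. This is not the contents of lib/db/search.ts: the `messages` table, its `id` and `content` columns, the trigger names, and the `toFtsQuery` helper are all hypothetical, chosen only to show the general shape of an external-content FTS5 index ranked with `bm25()`.

```typescript
// Hypothetical schema: a `messages` table with an integer `id` and a text
// `content` column. The FTS5 index is an external-content table that the
// triggers keep in sync on insert, delete, and update.
const createSearchIndexSql = `
CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts
  USING fts5(content, content='messages', content_rowid='id');

CREATE TRIGGER IF NOT EXISTS messages_ai AFTER INSERT ON messages BEGIN
  INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;

CREATE TRIGGER IF NOT EXISTS messages_ad AFTER DELETE ON messages BEGIN
  INSERT INTO messages_fts(messages_fts, rowid, content)
  VALUES ('delete', old.id, old.content);
END;

CREATE TRIGGER IF NOT EXISTS messages_au AFTER UPDATE ON messages BEGIN
  INSERT INTO messages_fts(messages_fts, rowid, content)
  VALUES ('delete', old.id, old.content);
  INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;
`;

// bm25() returns lower values for better matches, so ascending order
// puts the most relevant rows first.
const searchSql = `
SELECT rowid AS id, snippet(messages_fts, 0, '[', ']', '...', 12) AS excerpt
FROM messages_fts
WHERE messages_fts MATCH ?
ORDER BY bm25(messages_fts)
LIMIT 20;
`;

// Turn free-form user input into a MATCH expression in which every term is a
// quoted string, so FTS5 operators typed by the user are treated as plain text.
// Inside an FTS5 string, a double quote is escaped by doubling it.
// Returns "" for blank input; callers should skip the query in that case.
function toFtsQuery(userInput: string): string {
  return userInput
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(" ");
}

console.log(toFtsQuery("docker  compose"));
```

The SQL strings would be run once at startup and once per search by whatever SQLite driver the app uses; the quoting helper is one simple way to keep arbitrary input from being parsed as FTS5 query syntax.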
If you want every feature in the world — image generation, a code interpreter, knowledge graphs, a plugin marketplace — use Open WebUI or LibreChat. If you want a chat app that opens in under a second and stays out of your way, this is that.
git clone https://github.com/yoloyash/overtchat
cd overtchat
cp .env.example .env
echo "BETTER_AUTH_SECRET=$(openssl rand -hex 32)" >> .env
echo "SEARXNG_SECRET=$(openssl rand -hex 32)" >> .env
docker compose up -d --build

Open http://localhost:4718 and sign up; the setup wizard takes you the rest of the way.
LAN access: set BETTER_AUTH_URL=http://<your-lan-ip>:4718 in .env, then docker compose up -d.
Internet access: uncomment the cloudflared block in compose.yml and paste a tunnel token.
- Multi-user auth, first signup becomes admin
- Persistent chat history, auto-titled, full-text searchable
- File uploads — images, PDFs, Word, Excel, CSV, source code
- Projects with per-project system prompts
- Web search via bundled SearXNG. No API key.
- Text-to-speech via bundled Kokoro. No setup.
- Speech-to-text via Parakeet TDT v3 (opt-in, CPU or NVIDIA GPU)
- Chat export (JSON / Markdown)
Speech-to-text is off by default: the mic button in the composer is greyed out until you bring up the Parakeet sidecar:
docker compose --profile stt up -d # CPU (~670 MB model, ~2 GB RAM)
docker compose --profile stt-gpu up -d # NVIDIA GPU (requires NVIDIA Container Toolkit)

Multilingual (25 languages, auto-detected). All processing stays on your machine. The model downloads on first start (~10 s) and is cached in a Docker volume.
No telemetry, no analytics. No external calls except the LLM endpoint you configure and the bundled SearXNG (which runs on your machine). All data lives in a single SQLite file you can copy, back up, or delete.
- Docker + Docker Compose v2
- ~1 GB RAM free for the app stack (Kokoro TTS pulls ~100 MB on first boot)
- An LLM endpoint (API key or self-hosted)
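In practice, "an LLM endpoint" here means anything that accepts the OpenAI chat-completions wire format. As a rough illustration of what that format looks like (this is a hand-rolled request builder, not how @ai-sdk/openai-compatible is implemented, and `buildChatRequest` is a hypothetical helper), the only things that change between providers are the base URL, the key, and the model name. The Ollama URL and model name below are example values.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a streaming chat-completions request for any OpenAI-compatible server.
// Trailing slashes on the base URL are trimmed before the path is appended.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages, stream: true }),
  };
}

// Example values: a local Ollama server, which exposes its OpenAI-compatible
// API under /v1 and ignores the key.
const req = buildChatRequest("http://localhost:11434/v1/", "ollama", "llama3.2", [
  { role: "user", content: "Hello" },
]);
console.log(req.url); // http://localhost:11434/v1/chat/completions
```

The resulting object can be passed to `fetch(req.url, req)`; pointing the same builder at a hosted provider only requires a different base URL and a real API key.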
Next.js 16 · Vercel AI SDK v6 · Better Auth · Drizzle + SQLite · base-ui · Tailwind · SearXNG · Kokoro TTS
- docs/deploy.md — updates, backup, troubleshooting
MIT licensed. Fork it, white-label it, ship it. No branding clauses to negotiate around.