I build software in the open.
Hi, I am Sarma. I run an open-source studio from a desk in the UK. Nineteen MIT-licensed projects, eighty-seven long-form essays, and three flagships in active development. Everything I build is on GitHub with a whitepaper, an architecture diagram, and a quick-start guide.

I am Sarma, a software engineer based in the UK. I work on LLM infrastructure, coding agents, inference servers, storage engines, consensus protocols and the platform tools that hold all of it together.
What pulls me back to the desk every weekend is the same thing that pulled me into the industry: the quiet thrill of building something from scratch. A blank repository, a problem worth solving, a system that did not exist yesterday and ships today. Hours go past unnoticed.
Everything I build lives in the open at github.com/sarmakska, MIT licensed, with a whitepaper, an architecture diagram and a quick-start guide per project.
Connect on LinkedIn
Built. Documented. Shipped. All MIT.
SarmaLink-AI
Multi-provider AI gateway
OpenAI-compatible. Thirty-six engines across seven providers. When the primary returns 429, the next engine fires in under fifty milliseconds. Intent auto-router, MCP-shape tool catalog, persistent memory, FLUX image generation, TTS and STT cascades.
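The failover behaviour described above can be sketched roughly as follows. This is an illustration of the idea only, not SarmaLink-AI's code: the `Engine` type, the `RateLimited` exception and `complete_with_failover` are hypothetical names, and the sketch makes no attempt to reproduce the fifty-millisecond figure.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class RateLimited(Exception):
    """Hypothetical error raised when an engine answers HTTP 429."""


@dataclass
class Engine:
    name: str                   # hypothetical label, e.g. "primary"
    call: Callable[[str], str]  # sends the prompt, returns the completion


def complete_with_failover(engines: Sequence[Engine], prompt: str) -> tuple[str, str]:
    """Try engines in order; move on only when one is rate limited."""
    for engine in engines:
        try:
            return engine.name, engine.call(prompt)
        except RateLimited:
            continue  # this engine is throttled, try the next one
    raise RuntimeError("every engine in the chain was rate limited")


def throttled(prompt: str) -> str:
    # Stand-in for a provider that is currently returning 429.
    raise RateLimited("429 Too Many Requests")


chain = [Engine("primary", throttled), Engine("fallback", lambda p: f"echo: {p}")]
result = complete_with_failover(chain, "hello")
```

Here `result` names the engine that actually answered, which is the kind of detail a gateway would want to log for every request.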
slipstream
Claude Code plugin + cross-IDE MCP toolkit
Fourteen sp_* tools replace whole-file reads with scoped symbol pulls, for per-read savings of roughly 95%, reproducible via the pnpm benchmark. React + Vite + d3 dashboard with nine routed views. Cross-tab agent bus. 75-skill methodology library. Six editor install paths.
echo
The open Jarvis you actually own
Bring-your-own-subscription. Never an API key. Dispatches each prompt to whichever subscription-backed CLI you already pay for: claude, codex or gemini. Voice in, voice out, vision when it helps. Multi-monitor HUD. One Rust core. Local-first. MIT.
What I am shipping right now.
I write in public and ship in public. This is what is on the desk this week, as of the end of the day you opened this page.
Code that survives the six-month test
Code that works in a demo and dies in production is technical theatre. I optimise for the moment six months after launch, when somebody else has to read it, change it, and own the on-call page when something breaks.
Operating principles
Boring tech, surgical complexity. Postgres before Mongo. Server-rendered HTML before another SPA framework. Reach for the exotic only when the boring option genuinely runs out.
Open source by default. Nineteen public repositories under MIT, covering coding agents, gateways, inference, storage engines, consensus, and sandboxes. If a piece of work is generally useful, I publish it.
Numbers over narratives. Every blog post that claims a benchmark cites the source. Every chart marks whether the row is from a public benchmark or my own. A transparency footer on every post invites readers to flag bad numbers.
Ship the smallest thing that proves the next thing. Small commits, frequent deploys, observability before features. Big-bang releases are how products get cancelled mid-flight.
Defaults that respect the user. No silent analytics. No cookie banners I would hate. Real auth on day one, real row-level security on every table. The defaults you would want if you cloned my code at midnight.
Twelve lanes, nineteen repositories
Not abstract capability lists. Each card below maps to repositories you can clone, read, and run today.
Multi-provider AI gateways
SarmaLink-AI: multi-engine failover across fourteen providers, OpenAI-compatible proxy, persistent memory, image generation, live tools. Zero-cost frontier tier as the default route.
Agent orchestration
Durable multi-agent workflows with deterministic replay, journaled state in Postgres, hard tool and token budgets, BullMQ queue, Inspector UI. Workflows that survive restarts and pass audit.
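A minimal sketch of what deterministic replay from a journal means, assuming an in-memory dictionary in place of the Postgres journal. The `Journal` class, its `step` method and the sample workflow are hypothetical and are not taken from the repository.

```python
from typing import Any, Callable


class Journal:
    """In-memory stand-in for a journal table (assumption: the real store is Postgres)."""

    def __init__(self) -> None:
        self.entries: dict[str, Any] = {}

    def step(self, key: str, fn: Callable[[], Any]) -> Any:
        # On replay, a recorded step returns its stored result and fn is not re-run.
        if key in self.entries:
            return self.entries[key]
        result = fn()
        self.entries[key] = result
        return result


calls: list[str] = []


def fetch_quote() -> int:
    calls.append("fetch_quote")  # side effect that must not be repeated
    return 42


def workflow(journal: Journal) -> int:
    quote = journal.step("fetch_quote", fetch_quote)
    return journal.step("double", lambda: quote * 2)


journal = Journal()
first = workflow(journal)   # executes both steps and journals them
second = workflow(journal)  # simulated restart: replays from the journal
```

Both runs return the same value, and `fetch_quote` executes only once, which is the property that lets a workflow resume after a restart without repeating side effects.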
Real-time voice loops
Sub-second WebRTC voice agent with mediasoup SFU, pluggable STT, LLM and TTS adapters, explicit turn-state machine, barge-in cancellation tested across the awkward cases.
Evals as code
Datasets as files, scorers as functions, traces in DuckDB, viewer in HTMX. Six built-in scorers including LLM-as-judge. Regression mode fails CI when a release loses ground against the baseline.
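To make "scorers as functions" and regression mode concrete, here is a small sketch. The names `exact_match`, `mean_score` and `regressed` are illustrative assumptions, not the project's API, and the rows are made-up sample data.

```python
from typing import Callable, Sequence

# A scorer takes (output, expected) and returns a score between 0 and 1.
Scorer = Callable[[str, str], float]


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the output equals the expected text, ignoring outer whitespace."""
    return 1.0 if output.strip() == expected.strip() else 0.0


def mean_score(scorer: Scorer, rows: Sequence[tuple[str, str]]) -> float:
    """Average a scorer over (output, expected) pairs."""
    return sum(scorer(out, exp) for out, exp in rows) / len(rows)


def regressed(current: float, baseline: float, tolerance: float = 0.0) -> bool:
    """True when the current score falls below the baseline by more than the tolerance."""
    return current < baseline - tolerance


rows = [("Paris", "Paris"), ("Lyon", "Paris"), ("4", "4"), ("5 ", "5")]
score = mean_score(exact_match, rows)  # three of four rows match
```

A CI job built on this idea would exit non-zero whenever `regressed(score, baseline)` is true, which is what makes a release that loses ground fail the build.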
Production infrastructure
Helm charts for Next.js with the full observability stack (Prometheus, Grafana, Loki, Alertmanager) preconfigured. Terraform stack composing Vercel, Supabase, Cloudflare, DigitalOcean.
RAG and document intelligence
A clean end-to-end RAG starter you can clone, run, and ship in ten minutes. PDF chunking, embeddings, cosine retrieval, streaming answers with citations. Receipt OCR with Zod-validated JSON output.
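The cosine retrieval step can be illustrated in a few lines. This sketch uses hand-written three-dimensional vectors in place of real embeddings, and `cosine` and `top_k` are hypothetical helper names rather than functions from the starter.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float],
          chunks: Sequence[tuple[str, Sequence[float]]],
          k: int = 2) -> list[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": real ones would come from an embedding model.
chunks = [
    ("refund policy", [1.0, 0.0, 0.0]),
    ("shipping times", [0.0, 1.0, 0.0]),
    ("refund window", [0.9, 0.1, 0.0]),
]
hits = top_k([1.0, 0.0, 0.0], chunks, k=2)
```

The retrieved chunks are what get passed to the model as context, and keeping their text alongside the vectors is what makes citations possible in the streamed answer.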
Inference internals
forge-infer: a minimal LLM inference server in Python with paged KV-cache, continuous batching, and speculative decoding. Built to understand the layer that the SDKs hide.
Storage and consensus
lsmdb: a log-structured merge-tree engine in Go with WAL, SSTables, bloom filters, MVCC snapshots. raftkv: a Raft key-value store with a fault-injection harness proving linearizability under partitions.
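For readers unfamiliar with the bloom filters mentioned above, the sketch below shows the idea: a compact bit array that can say "definitely not here" and so lets a read skip an SSTable. It is written in Python for brevity, whereas lsmdb itself is in Go. The class, its default sizes and the SHA-256 based hashing are assumptions for illustration, not the engine's implementation.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: bytes) -> Iterator[int]:
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means the key was never added; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


bloom = BloomFilter()
bloom.add(b"user:1")
```

A storage engine would consult `might_contain` before opening a table on disk and only pay for the read when the answer is "possibly present".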
Sandboxing and isolation
sandboxd: a WebAssembly sandbox in Rust with a deny-by-default host ABI and strict CPU, wall-clock, and memory limits. Built for the moment somebody hands an agent untrusted code.
Observability that pays its rent
Structured logs, RED metrics, exemplar-linked traces, dashboards that diagnose rather than decorate. If a graph never gets opened during an incident, it does not deserve to exist.
Multi-tenant SaaS plumbing
shipyard: tenant isolation by row and by schema, RBAC, audit log, billing hooks, rate limits. The boring foundation under every B2B product, ready to clone.
Developer-grade tooling
slipstream: a token-efficient coding agent with persistent memory and a live local dashboard. The tool I use on myself before I ship it to anybody else.