v2.6.0 shipping · 46 cli adapters · apache 2.0 · on-prem · 136,000+ installs

several coding agents. one git tree. only what passes.

bernstein is a deterministic python scheduler that runs cli coding agents in parallel. no llm in the loop.

ships adapters for claude code, codex, gemini cli, aider, and 42 more. each runs in its own git worktree. lint, types, and tests gate every merge.

ask the docs · grounded in source + 14 posts · cited

ask anything.

how it works

How does Bernstein work?

Bernstein is an open-source orchestrator for CLI coding agents. It decomposes a goal into tasks, spawns Claude Code, Codex, Gemini CLI and 43 other agents into isolated git worktrees, runs each task in parallel, then verifies the output through lint, type checks, tests, and an optional cross-model review before merging. The scheduler is plain Python - deterministic, replayable from an HMAC-chained audit log, no LLM tokens spent on coordination.
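
To make the isolation step concrete, here is a minimal sketch of giving each task its own git worktree and branch. It relies only on the standard `git worktree add` command; the `.worktrees/` directory, the `agent/<task-id>` branch naming, and both function names are assumptions for illustration, not Bernstein's actual layout or API.

```python
import subprocess
from pathlib import Path


def worktree_command(repo_root: Path, task_id: str) -> list[str]:
    """Build the git command that creates an isolated worktree for one task.

    The .worktrees/ directory and agent/<task_id> branch name are
    hypothetical conventions chosen for this sketch.
    """
    path = repo_root / ".worktrees" / task_id
    return [
        "git", "-C", str(repo_root),
        "worktree", "add",
        "-b", f"agent/{task_id}",  # new branch dedicated to this task
        str(path),
    ]


def create_worktree(repo_root: Path, task_id: str) -> Path:
    """Run the command; raises CalledProcessError if git refuses."""
    subprocess.run(worktree_command(repo_root, task_id), check=True)
    return repo_root / ".worktrees" / task_id
```

Because every agent edits files under its own directory and branch, parallel tasks cannot overwrite each other's working copy; conflicts only surface at merge time.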

one run, four stages.

decompose

manager → tasks · roles · signals

spawn

agents → isolated worktrees

verify

janitor → tests · types · lint

merge

only verified diffs land
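
the verify and merge stages can be pictured as a gate that runs each check inside the task's worktree and allows a merge only if every check exits cleanly. this is a rough sketch, not bernstein's janitor: `run_gates`, `may_merge`, and the default command list are hypothetical. ruff and pytest appear in the sample audit log on this page; mypy is an assumed stand-in for the type check.

```python
import subprocess

# Assumed gate commands. ruff and pytest show up in the sample audit log;
# mypy is a stand-in for whatever type checker a project uses.
DEFAULT_GATES = (
    ("lint", ["ruff", "check", "."]),
    ("types", ["mypy", "."]),
    ("tests", ["pytest", "-q"]),
)


def run_gates(worktree, gates=DEFAULT_GATES) -> dict[str, bool]:
    """Run each gate command inside the worktree and record pass/fail."""
    results = {}
    for name, cmd in gates:
        try:
            proc = subprocess.run(cmd, cwd=worktree, capture_output=True, text=True)
            results[name] = proc.returncode == 0
        except FileNotFoundError:
            # A missing tool counts as a failed gate, never a silent pass.
            results[name] = False
    return results


def may_merge(results: dict[str, bool]) -> bool:
    """Only a non-empty, all-green result set is allowed to land."""
    return bool(results) and all(results.values())
```

treating an empty result set as a failure is a deliberate choice in this sketch: a task that ran no checks has not been verified.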

from the blog

three pieces from the blog.

field notes from the orchestra pit. hand-picked: where bernstein sits in the multi-agent coding category, what it looks like in the cloud, and how it started.

May 20, 2026 · 18 min read

bernstein 2.x recap: lineage, ten trackers, A2A capability cards, and a CI that started fixing itself

Thirteen releases since the 1.10 recap consolidated into nine themes: a per-artefact transparency log with Ed25519 signatures, ten tracker adapters from Jira to Plane, A2A capability cards, MCP client and server hardening, a Playwright sandbox for UI agents, a secrets broker, supply-chain coverage with SBOM and OSSF Scorecard, calibrated cost guards, and a web UI plus PWA in the wheel.

multi-agent orchestration · release · lineage · audit log

Apr 14, 2026 · 3 min read

agents on cloudflare: workers, durable objects, r2, d1

bernstein 1.8.4 cloudflare backend for ai coding agents: workers run agents, durable workflows handle multi-step tasks, r2 + d1 hold state.

Cloudflare Workers · cloud AI agents · serverless orchestration · multi-agent orchestration

Mar 29, 2026 · 2 min read

bernstein 1.0: open-source orchestrator for ai coding agents

Orchestrate Claude Code, Codex, Gemini CLI + 40 other CLI coding agents in parallel git worktrees. Deterministic scheduler, HMAC-signed audit chain.

multi-agent orchestration · AI coding agents · Claude Code · open-source

evidence, not vibes

every step signed, in order, on disk.

bernstein writes an hmac-signed event chain to .bernstein/audit.log. each entry references the previous hash. tampering breaks verification. nothing leaves your machine.

this is the artifact security review actually wants. not a screenshot, not a SOC2 PDF - a hash chain you can replay.

~/proj $ bernstein audit verify

# reading .bernstein/audit.log · 412 entries · 18m04s span

17:42:01  manager.plan     cb84a1…  13 tasks · est $4.20
17:42:18  scheduler.spawn  3e09f7…  t-001 → backend (sonnet)
17:42:18  scheduler.spawn  81c2dd…  t-002 → backend (sonnet)
17:42:19  scheduler.spawn  b50af8…  t-003 → qa      (opus)
17:55:11  janitor.verify   d014a3…  t-005 docs · pytest 84/84 · ruff 0
17:55:14  manager.merge    ee7c12…  t-005 → main · 4 files · clean
18:02:14  janitor.fail     9aa10b…  t-003 qa · pytest collect err · 3/3 retries
18:02:30  scheduler.route  71b4cc…  t-003 opus → sonnet · awaiting op
18:09:14  scheduler.pause  4d2e09…  conflict · t-001 ↔ t-007 · src/auth/refresh.py

verify chain      ok ✓  412/412 entries · last hash 4d2e09cb…
signed by         ~/.bernstein/keys/audit.ed25519
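
for readers who want to see why tampering breaks verification, here is a minimal sketch of an hmac-chained log in python. it is not bernstein's on-disk format: the entry fields, the all-zero genesis hash, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_mac(key: bytes, prev_hash: str, event: dict) -> str:
    """HMAC-SHA256 over the previous hash plus the canonicalised event."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(chain: list[dict], key: bytes, event: dict) -> None:
    """Append an event that commits to the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "event": event,
                  "hash": entry_mac(key, prev_hash, event)})


def verify_chain(chain: list[dict], key: bytes) -> bool:
    """Recompute every MAC in order; any edit or reordering fails the check."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        expected = entry_mac(key, prev_hash, entry["event"])
        if not hmac.compare_digest(expected, entry["hash"]):
            return False
        prev_hash = entry["hash"]
    return True
```

because each mac covers the previous entry's hash, changing one event invalidates that entry and every entry after it unless the attacker also holds the key.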

frequently asked

the four questions that block install.

is the scheduler an llm?

the core loop is plain python. who runs, who's blocked, what merges is deterministic and replayable. agents are llms, model selection is llm-assisted (capability router + recommender), best-of-n picks a winner via llm judge. all of those are opt-in and pluggable, so if you'd rather have the planner decide via llm too, you wire your own through the routing layer. just don't put it in the scheduler tick.
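
one way to picture a deterministic tick, as a hedged sketch rather than bernstein's scheduler: a pure function from task state to the list of tasks to spawn, with a fixed tie-break so the same state always yields the same decision. the `Task` fields, the status values, and `tick` itself are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str
    deps: tuple[str, ...] = ()
    status: str = "pending"  # assumed states: pending | running | done


def tick(tasks: list[Task], max_parallel: int) -> list[str]:
    """Decide which pending tasks to spawn next.

    No randomness, no clock, no model call: the result depends only on
    the arguments, so replaying the same state reproduces the decision.
    """
    done = {t.task_id for t in tasks if t.status == "done"}
    running = sum(1 for t in tasks if t.status == "running")
    free_slots = max(0, max_parallel - running)
    # Sort by task id so input order cannot change the outcome.
    ready = sorted(
        t.task_id
        for t in tasks
        if t.status == "pending" and all(d in done for d in t.deps)
    )
    return ready[:free_slots]
```

an llm-backed planner can still decide what the tasks are; in this sketch it would sit upstream and hand `tick` a task list, never run inside it.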

does it phone home?

nothing leaves your machine without your config. opt-in telemetry is full and audit-grade when you turn it on: hmac-chained run trail, per-task tool calls, model usage, token cost, latency percentiles. ship it to your own otel collector, datadog, splunk, s3 bucket. defaults to local-only because the on-prem audience wants that, but the enterprise hooks are there.

where does it run?

wherever you point it. your laptop, on-prem behind a firewall, cloudflare workers as the cloud runtime, kubernetes as a multi-node cluster, or a hybrid of those. sandbox-execution mode is supported (cloudflare sandbox, local docker). your repo is the input, your tests are the gate; bernstein adapts to the host. nothing forces a saas hop.

how is this different from claude code?

claude code can spawn sub-agents on its own; bernstein does the same thing across 40+ different cli agents at once and verifies their output against your tests instead of trusting it. claude code is the most common primary backend inside bernstein - using one does not exclude the other.

one engineering post a month.

what we shipped, what broke, what we learned. one click to unsubscribe.