v2.6.0 shipping · 46 cli adapters · apache 2.0 · on-prem · 136,000+ installs

several coding agents. one git tree. only what passes.

bernstein is a deterministic python scheduler that runs cli coding agents in parallel. no llm in the loop.

ships adapters for claude code, codex, gemini cli, aider, and 42 more. each runs in its own git worktree. lint, types, and tests gate every merge.


how it works

How does Bernstein work?

Bernstein is an open-source orchestrator for CLI coding agents. It decomposes a goal into tasks, spawns Claude Code, Codex, Gemini CLI and 43 other agents into isolated git worktrees, runs each task in parallel, then verifies the output through lint, type checks, tests, and an optional cross-model review before merging. The scheduler is plain Python - deterministic, replayable from an HMAC-chained audit log, no LLM tokens spent on coordination.
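As a rough illustration of the isolation step, the sketch below builds one `git worktree add` command per task. The `task/<id>` branch names, the `.worktrees/` directory, and the `worktree_command` helper are assumptions made for this example, not Bernstein's actual layout or API; only the `git worktree add -b <branch> <path> <base>` invocation itself is standard git.

```python
from pathlib import Path


def worktree_command(repo: Path, task_id: str, base: str = "main") -> list[str]:
    """Build the git call that gives one task its own branch and checkout."""
    branch = f"task/{task_id}"            # hypothetical branch naming
    path = repo / ".worktrees" / task_id  # hypothetical directory layout
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


# One command per task; each agent would then run inside its own path.
commands = [worktree_command(Path("/repo"), t) for t in ("t1", "t2")]
```

Each list can be handed to `subprocess.run` as-is. Because every task gets a separate branch and directory, agents running in parallel never write to the same checkout.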

one run, four stages.

01 decompose: manager → tasks · roles · signals
02 spawn: agents → isolated worktrees
03 verify: janitor → tests · types · lint
04 merge: only verified diffs land
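the verify stage can be pictured as a gate over exit codes: run each check, and merge only if none fails. a minimal sketch under that assumption follows. `passes_gate` and the stand-in commands are hypothetical, not bernstein's janitor; a real gate would call your own linter, type checker, and test runner.

```python
import subprocess
import sys
from typing import Sequence


def passes_gate(
    checks: Sequence[tuple[str, Sequence[str]]], cwd: str = "."
) -> tuple[bool, list[str]]:
    """Run every check in order; return (all_passed, names_of_failed_checks)."""
    failed = []
    for name, argv in checks:
        result = subprocess.run(argv, cwd=cwd, capture_output=True)
        if result.returncode != 0:
            failed.append(name)
    return (not failed, failed)


# Stand-in commands so the sketch runs anywhere: one exits 0, one exits 1.
ok_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
bad_cmd = [sys.executable, "-c", "raise SystemExit(1)"]

demo_ok = passes_gate([("lint", ok_cmd), ("types", ok_cmd), ("tests", ok_cmd)])
demo_bad = passes_gate([("lint", ok_cmd), ("types", bad_cmd), ("tests", ok_cmd)])
```

the gate is a plain boolean over exit codes, so the merge decision does not depend on any model's opinion of the diff.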

evidence, not vibes

every step signed, in order, on disk.

bernstein writes an hmac-signed event chain to .bernstein/audit.log. each entry references the previous hash. tampering breaks verification. nothing leaves your machine.

this is the artifact security review actually wants. not a screenshot, not a SOC2 PDF - a hash chain you can replay.
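to make the chain idea concrete, here is a minimal sketch of an hmac-chained log, assuming each entry's mac covers the previous entry's mac plus the event. the entry fields (`prev`, `event`, `mac`), the all-zero genesis value, and the helper names are illustrative assumptions; the actual format of .bernstein/audit.log is not described here.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _sign(key: bytes, prev: str, event: dict) -> str:
    """MAC over the previous entry's MAC plus the canonicalised event."""
    payload = json.dumps({"prev": prev, "event": event}, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(chain: list[dict], key: bytes, event: dict) -> None:
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append({"prev": prev, "event": event, "mac": _sign(key, prev, event)})


def verify_chain(chain: list[dict], key: bytes) -> bool:
    """Replay the chain from the start; any edited entry breaks it."""
    prev = GENESIS
    for entry in chain:
        expected = _sign(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True


key = b"demo-key"
log: list[dict] = []
append_entry(log, key, {"step": "spawn", "task": "t1"})
append_entry(log, key, {"step": "merge", "task": "t1"})
intact = verify_chain(log, key)    # chain as written
log[0]["event"]["task"] = "t2"     # simulate tampering with an early entry
tampered = verify_chain(log, key)  # replay now fails at entry 0
```

because each mac folds in the one before it, changing, dropping, or reordering an entry invalidates every entry after it unless the attacker also holds the key.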

frequently asked

the four questions that block install.

is the scheduler an llm?

the core loop is plain python. who runs, who's blocked, what merges is deterministic and replayable. agents are llms, model selection is llm-assisted (capability router + recommender), best-of-n picks a winner via llm judge. all of those are opt-in and pluggable, so if you'd rather have the planner decide via llm too, you wire your own through the routing layer. just don't put it in the scheduler tick.
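what "deterministic and replayable" can mean for a scheduler tick, as a minimal sketch: a pure function from task state to the list of tasks to start, with a fixed tie-break. the `Task` shape, the `tick` signature, and the alphabetical ordering are assumptions for illustration, not bernstein's real scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    id: str
    deps: tuple[str, ...] = ()


def tick(tasks: list[Task], done: set[str], running: set[str], slots: int) -> list[str]:
    """Return the task ids to start now. Same inputs always give the same output."""
    ready = [
        t.id
        for t in tasks
        if t.id not in done
        and t.id not in running
        and all(d in done for d in t.deps)  # blocked until every dependency is done
    ]
    free = max(0, slots - len(running))
    return sorted(ready)[:free]  # fixed ordering keeps replays identical


tasks = [Task("b", ("a",)), Task("a"), Task("c"), Task("d", ("b", "c"))]
first = tick(tasks, done=set(), running=set(), slots=2)
second = tick(tasks, done={"a", "c"}, running=set(), slots=2)
```

no randomness, clock, or model call appears inside `tick`, so replaying the same sequence of states reproduces the same scheduling decisions.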

does it phone home?

nothing leaves your machine without your config. opt-in telemetry is full and audit-grade when you turn it on: hmac-chained run trail, per-task tool calls, model usage, token cost, latency percentiles. ship it to your own otel collector, datadog, splunk, s3 bucket. defaults to local-only because the on-prem audience wants that, but the enterprise hooks are there.

where does it run?

wherever you point it. your laptop, on-prem behind a firewall, cloudflare workers as the cloud runtime, kubernetes as a multi-node cluster, or a hybrid of those. sandbox-execution mode is supported (cloudflare sandbox, local docker). your repo is the input, your tests are the gate; bernstein adapts to the host. nothing forces a saas hop.

how is this different from claude code?

claude code can spawn sub-agents on its own; bernstein does the same thing across 40+ different cli agents at once and verifies their output against your tests instead of trusting it. claude code is the most common primary backend inside bernstein - using one does not exclude the other.
