sipyourdrink-ltd/bernstein

Bernstein

"To achieve great things, two things are needed: a plan and not quite enough time." - Leonard Bernstein

why the name?

Bernstein is named after Leonard Bernstein, the American conductor and composer. The project orchestrates a crew of CLI coding agents the way Bernstein conducted the New York Philharmonic: every player on cue, the score deterministic, the conductor accountable for the result. He is the original orchestrator the project takes its name from.

deterministic multi-agent CLI orchestration

CI PyPI GHCR Python 3.12+ License OpenSSF Scorecard CodeQL Open in Codespaces

website · docs · install · first run · glossary · limitations · sponsor


Bernstein is a deterministic Python scheduler that runs a crew of CLI coding agents (Claude Code, Codex, Gemini CLI, and 40 more) against a single goal in parallel git worktrees, with an HMAC-signed audit chain over every step.

at a glance

  • 44 CLI agent adapters in v2.2.x: 41 third-party wrappers, 2 leaf-node delegators, plus a generic --prompt wrapper. Source of truth: the supported agents table below.
  • HMAC-SHA256 audit chain per RFC 2104, one record per scheduling decision, tamper-evident. Operator guide: docs/security/audit-log.md.
  • Bearer-token task server authenticates the manager and every worker. Per-session zero-trust JWT in .sdd/runtime/agent_tokens/, legacy BERNSTEIN_AUTH_TOKEN fallback, opt-out via BERNSTEIN_AUTH_DISABLED=1. Flow + diagnostics: docs/security/manager-auth.md.
  • Signed agent cards use detached JWS (RFC 7515 §A.5) over RFC 8785 (JCS) canonicalization, with Ed25519 / EdDSA keys. Code: src/bernstein/core/security/agent_card_signer.py.
  • Per-artefact lineage records every file write linked back to producer + inputs + prompt SHA + model + cost. CLI: bernstein lineage verify <run_id>.
  • Deterministic scheduler: zero LLM in the coordination loop. Plain Python decides who runs, where, with what budget. Replay yesterday's plan, get yesterday's task graph.
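A minimal sketch of what an HMAC-SHA256 chained log looks like, using only the Python standard library. The record layout (prev / payload / mac), the GENESIS placeholder and the function names are assumptions made for illustration, not Bernstein's on-disk format; the operator guide in docs/security/audit-log.md is the reference for the real thing.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical placeholder MAC that precedes the first record


def sign_record(key: bytes, prev_mac: str, payload: dict) -> dict:
    """Return a record whose MAC covers the previous MAC plus the payload."""
    # Sorted keys and fixed separators keep the serialized bytes stable.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    mac = hmac.new(key, (prev_mac + body).encode("utf-8"), hashlib.sha256).hexdigest()
    return {"prev": prev_mac, "payload": payload, "mac": mac}


def append(chain: list, key: bytes, payload: dict) -> None:
    """Append one record, linking it to the MAC of the record before it."""
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append(sign_record(key, prev, payload))


def verify(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edited or reordered record fails."""
    prev = GENESIS
    for record in chain:
        expected = sign_record(key, prev, record["payload"])["mac"]
        if record["prev"] != prev or not hmac.compare_digest(expected, record["mac"]):
            return False
        prev = record["mac"]
    return True
```

Because each MAC covers the previous one, changing any earlier record invalidates every record after it, which is what makes the log tamper-evident rather than merely signed.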

why this exists

I wrote Bernstein because I was paying $400/month in Claude bills to run three coding agents in parallel, and getting nondeterministic merges.

Apache 2.0, solo maintained. Live stats: bernstein.run.

install in 30 seconds

pipx install bernstein
bernstein init
bernstein run -g "fix the failing test in tests/test_foo.py"

See installed integrations: bernstein integrations list --installed.

sponsor

If Bernstein routed a model that saved you a Claude bill, $25 covers a month of my coffee.

github.com/sponsors/chernistry

who this is for

Specific shapes where the value lands:

  • engineering teams running >=3 CLI coding agents in parallel: each agent gets its own git worktree, the merge queue serialises landings, no race conditions
  • operators running compliance-sensitive workflows: every routing decision is plaintext, the audit log is HMAC-signed and tamper-evident, no SaaS hop, no third-party data plane
  • platform teams that need an audit log of agent decisions: the orchestrator writes one row per scheduling decision, you can grep it
  • anyone burning more than $1k/mo on coding agents who wants determinism: you can replay yesterday's plan and get yesterday's task graph
  • forward-deployed engineers dropping into a client repo: credentials stay in your env, not the client's; agents you spawn are whichever CLI tool the client already trusts

If you nodded at two of those bullets, this fits.

who this is NOT for

  • "I want one pair-programmer to chat with about my code": a single CLI agent is fine. Bernstein adds orchestration overhead you don't need.
  • prototypes where merge gates are overkill: the lint/types/tests/cross-model-review pipeline adds value when the cost of a bad merge is real, and is friction when you're throwing the repo away on Friday.
  • non-coding tasks (research, writing, data analysis pipelines): Bernstein wraps CLI coding agents specifically, not generic LLM workflows.
  • anyone who wants a SaaS wrapper with a credit-card form: Bernstein is on-prem only by design.
  • teams that need a vendor with a support SLA and a contract: solo open-source project. GitHub issues are how support happens.
  • research-shape "let the agents collaborate emergently" use cases: the deterministic scheduler is a hard wall there.

how it compares

Closest neighbours in this category live in docs/compare/README.md. What Bernstein does well is the auditability surface: HMAC-chained audit, signed agent cards, per-artefact lineage, air-gap deploy profile, plus the widest CLI adapter coverage.


what is this, in one paragraph

You tell Bernstein what you want built. It splits the work across several AI coding agents, runs them in parallel inside isolated git worktrees, records every handoff in an HMAC-SHA256-chained audit log (RFC 2104), runs the tests, and merges the code that actually passes. State lives in files (.sdd/), credentials are scoped per agent, and the audit trail is signed.

other install methods

curl -fsSL https://bernstein.run/install.sh | sh        # macOS / Linux one-liner
irm https://bernstein.run/install.ps1 | iex             # Windows PowerShell
pip install bernstein                                   # pip
uv tool install bernstein                               # uv
brew tap chernistry/tap && brew install bernstein       # Homebrew

See the full install matrix for dnf copr, npx, optional extras, and the wheelhouse path for air-gapped sites.

why the scheduler is plain Python

Most agent orchestrators use an LLM to decide who does what. That is non-deterministic and burns tokens on scheduling instead of code. Bernstein does one LLM call to break down your goal, then the rest (running agents in parallel, isolating their git branches, running tests, routing retries) is plain Python. Every run is reproducible. Every step is logged and replayable.

No framework to learn. No vendor lock-in. Swap any agent, any model, any provider.
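To make "plain Python decides who runs" concrete, here is a small sketch of deterministic ordering over a task graph using the standard-library graphlib module. The function name schedule_waves and the task names are hypothetical, and this is not Bernstein's scheduler; it only shows that the same dependency map always yields the same waves of parallel work.

```python
from graphlib import TopologicalSorter


def schedule_waves(deps: dict) -> list:
    """Group tasks into waves that can run in parallel, in a stable order.

    `deps` maps each task name to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    waves = []
    while sorter.is_active():
        # Sort the ready set so ties never depend on set iteration order.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Given the same input, the function returns the same list every time, which is the property that makes replaying a plan meaningful.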

Bernstein in action: parallel AI agents orchestrated in real time

What you see while it runs:

$ bernstein -g "Add JWT auth"
[manager] decomposed into 4 tasks
[agent-1] claude-sonnet: src/auth/middleware.py  (done, 2m 14s)
[agent-2] codex:         tests/test_auth.py      (done, 1m 58s)
[verify]  all gates pass. merging to main.

YAML workflow manifests (optional)

When bernstein run -g "<goal>" is too coarse-grained, bernstein workflow runs a declarative DAG of agent / command / loop nodes. Manifests are plain YAML, validated up-front, dispatched through the same AgentSpawner the rest of Bernstein uses.

bernstein workflow list                          # bundled + user-installed
bernstein workflow run idea-to-pr -g "Add JWT auth"
bernstein workflow init my-flow                  # scaffold a starter manifest
bernstein workflow validate path/to/flow.yaml

Stock workflows shipping in the wheel: idea-to-pr, refactor-with-tests, security-review, doc-update, dependency-bump, hot-fix. Loop nodes re-fire until a bash predicate exits 0. fresh_context: true mints a new agent session per iteration. Per-step CLI/model routing: docs/workflows/per-step-routing.md.
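The loop-node rule ("re-fire until a predicate exits 0") can be sketched in a few lines. The names run_loop and max_iterations are assumptions for illustration; the real manifest schema and any iteration cap are defined by Bernstein, not by this sketch.

```python
import subprocess
from typing import Callable, Sequence


def run_loop(step: Callable[[int], None], predicate: Sequence[str], max_iterations: int = 5) -> int:
    """Run `step`, then the predicate command; stop once the predicate exits 0.

    Returns the number of iterations used. Raises RuntimeError when the
    (hypothetical) iteration cap is reached without a passing predicate.
    """
    for iteration in range(1, max_iterations + 1):
        step(iteration)  # e.g. spawn an agent for this iteration
        if subprocess.run(list(predicate)).returncode == 0:
            return iteration
    raise RuntimeError(f"predicate still failing after {max_iterations} iterations")
```

A typical predicate would be a test command such as ["bash", "-c", "pytest -q"]: the loop keeps sending the agent back in until the suite is green or the cap is hit.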

use cases

  • forward-deployed engineering: drop the crew onto a client repo when you arrive, take it with you when you leave.
  • self-evolving projects: point Bernstein at its own repo and let it execute the backlog (this codebase is one).
  • CI fleets: run a crew of agents in parallel on PRs, with per-agent credential scoping and signed audit trail.
  • air-gapped deployment: install from a signed wheelhouse, run with --profile airgap to deny outbound by default. See Air-gap installation.

supported agents

Bernstein auto-discovers installed CLI agents. Mix them in the same run: cheap local models for boilerplate, heavier cloud models for architecture.

44 CLI agent adapters: 41 third-party wrappers, 2 leaf-node delegators, plus a generic wrapper for anything with --prompt.

| Agent | Models | Install |
|---|---|---|
| Claude Code | Opus 4, Sonnet 4.6, Haiku 4.5 | npm install -g @anthropic-ai/claude-code |
| Codex CLI | GPT-5, GPT-5 mini | npm install -g @openai/codex |
| OpenAI Agents SDK v2 | GPT-5, GPT-5 mini, o4 | pip install 'bernstein[openai]' |
| GitHub Copilot CLI | Copilot-managed (GPT-5, Sonnet 4.6) | npm install -g @github/copilot |
| Gemini CLI | Gemini 2.5 Pro, Gemini Flash | npm install -g @google/gemini-cli |
| Cursor | Sonnet 4.6, Opus 4, GPT-5 | Cursor app |
| Devin Terminal (Cognition) | Devin-managed | curl -fsSL https://cli.devin.ai/install.sh \| bash then devin auth login |
| Aider | Any OpenAI/Anthropic-compatible | pip install aider-chat |
| Amp | Amp-managed | npm install -g @sourcegraph/amp |
| CLM gateway (sovereign / on-prem LLM) | Any OpenAI-compatible CLM endpoint | pip install aider-chat, then set CLM_ENDPOINT / CLM_TOKEN |
| Cody | Sourcegraph-hosted | npm install -g @sourcegraph/cody |
| Continue | Any OpenAI/Anthropic-compatible | npm install -g @continuedev/cli (binary: cn) |
| Goose | Any provider Goose supports | See Goose docs |
| IaC (Terraform/Pulumi) | Any provider the base agent uses | Built-in |
| Junie | BYOK (Anthropic, OpenAI, Google, xAI, OpenRouter, Copilot) | curl -fsSL https://junie.jetbrains.com/install.sh \| bash |
| Kilo | Kilo-hosted | See Kilo docs |
| Kiro | Kiro-hosted | See Kiro docs |
| AWS Q Developer | Amazon Q-managed (Claude-backed) | brew install --cask amazon-q then q login |
| Ollama + Aider | Local models (offline) | brew install ollama |
| OpenCode | Any provider OpenCode supports | See OpenCode docs |
| Qwen | Qwen Code models | npm install -g @qwen-code/qwen-code |
| Cloudflare Agents | Workers AI models | bernstein cloud login |
| OpenHands | Any LiteLLM-supported (Anthropic, OpenAI, ...) | uv tool install openhands --python 3.12 |
| Open Interpreter | Any (LiteLLM-backed) | pip install open-interpreter |
| gptme | Anthropic, OpenAI, OpenRouter | pipx install gptme |
| Plandex | Plandex Cloud or self-hosted models | curl -sL https://plandex.ai/install.sh \| bash |
| AIChat | OpenAI, Anthropic, OpenRouter, Groq, Gemini | cargo install aichat |
| Letta Code | Letta-routed (Anthropic, OpenAI) | npm install -g @letta-ai/letta-code |
| Generic | Any CLI with --prompt | Built-in |

Any adapter also works as the internal scheduler LLM:

internal_llm_provider: gemini            # or qwen, ollama, codex, goose, ...
internal_llm_model: gemini-3.1-pro

Tip

Run bernstein --headless for CI pipelines. No TUI, structured JSON output, non-zero exit on failure.

quick start

cd your-project
bernstein init                    # creates .sdd/ workspace + bernstein.yaml
bernstein -g "Add rate limiting"  # agents spawn, work in parallel, verify, exit
bernstein live                    # watch progress in the TUI dashboard
bernstein stop                    # graceful shutdown with drain

For multi-stage projects, define a YAML plan:

bernstein run plan.yaml           # skips LLM planning, goes straight to execution
bernstein run --dry-run plan.yaml # preview tasks and estimated cost

web UI

v2.0.0 ships a minimal web UI (operator-requested; UI is a side surface, core orchestrator is the priority).

bernstein gui serve               # http://127.0.0.1:8052/ui/
bernstein gui serve --dev         # expects `npm run dev` on :5173
bernstein gui serve --minimal     # skip the full /api/v1/* surface

The Vite bundle is committed under src/bernstein/gui/static/, so wheel installs work without a Node toolchain. Surface tour + per-task drawer: docs/web-ui.md.

how it works

Bernstein runs a four-stage pipeline per goal:

  1. Decompose. The manager breaks your goal into tasks with roles, owned files, and completion signals. One LLM call, then plain Python from there.
  2. Spawn. Agents start in isolated git worktrees, one per task. Main branch stays clean.
  3. Verify. The janitor checks concrete signals: tests pass, files exist, lint clean, types correct.
  4. Merge. Verified work lands in main. Failed tasks get retried or routed to a different model.

The orchestrator is a Python scheduler, not an LLM. Scheduling decisions are deterministic, auditable, and reproducible. Every step writes a record to the HMAC-chained audit log (.sdd/audit/YYYY-MM-DD.jsonl) per RFC 2104.
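The Spawn and Verify stages can be pictured with two small helpers. This is a sketch under stated assumptions: the agent/<task_id> branch naming, the .worktrees/ directory and both function names are hypothetical, and only the git worktree add -b <branch> <path> <commit-ish> command shape is real git.

```python
from pathlib import Path


def worktree_add_argv(repo: Path, task_id: str, base: str = "main") -> list:
    """Build the `git worktree add` command that isolates one task.

    Branch and directory names here are illustrative, not Bernstein's.
    """
    branch = f"agent/{task_id}"
    path = repo / ".worktrees" / task_id
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def gate(signals: dict) -> str:
    """Return "merge" only when every verification signal passed, else "retry".

    An empty signal set counts as unverified, so it never merges.
    """
    return "merge" if signals and all(signals.values()) else "retry"
```

Passing the returned list to subprocess.run would create the worktree; keeping command construction separate from execution makes the decision easy to log and to test without touching a repository.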

cloud execution (Cloudflare)

bernstein cloud runs agents on Cloudflare Workers with R2-backed workspace sync. See docs/cloudflare/.

bernstein cloud login      # authenticate with Bernstein Cloud
bernstein cloud deploy     # push agent workers
bernstein cloud run plan.yaml  # execute a plan on Cloudflare

capabilities

Bernstein ships parallel execution + worktree isolation + a janitor that gates merges on tests/lint/types, signed lineage records, MCP server mode, an HMAC-SHA256 audit chain, and 44 CLI adapters out of the box. Pluggable sandbox backends (worktree, Docker, E2B, Modal), pluggable artifact sinks (local, S3, GCS, Azure Blob, R2), progressive-disclosure skill packs, and a lethal-trifecta capability gate round it out.

Full feature matrix: docs/reference/FEATURE_MATRIX.md. Recent features: docs/whats-new.md.

regulatory anchors

Regulatory mappings (EU AI Act Article 12, SOC 2 CC4/CC7, DORA / NIS2, OWASP ASI06, RFC 2104/7515/8785/8037/7636/8707) live in docs/compliance/. These are mappings, not certifications.

operator commands

Highest-value commands; full list in docs/operations/commands.md.

| Command | What it does |
|---|---|
| bernstein pr | Auto-creates a GitHub PR from a completed session; body carries the janitor's gate results and cost breakdown. |
| bernstein from-ticket <url> | Imports a Linear / GitHub Issues / Jira ticket as a Bernstein task. |
| bernstein autofix | Daemon that monitors open Bernstein PRs; spawns a fixer agent when CI fails. |
| bernstein hooks | Lifecycle hooks (pre_task, post_task, pre_merge, etc.) as shell scripts or pluggy @hookimpls. |
| bernstein backlog claim --role reviewer | Atomically claims one eligible row from .sdd/runtime/task-backlog.json for external workers. |
| bernstein chat serve --platform=telegram\|discord\|slack | Drive runs from chat with /run, /status, /approve, /reject. |
| bernstein workflow run <name> | Run a YAML workflow manifest. |
| bernstein schedule add\|list\|run | Manage operator-registered recurring schedules; schedule audit walks persisted fire receipts to prove the sequence is replayable. |

retrieval & caching: what's actually under the hood

Bernstein deliberately uses no neural embeddings, no vector databases, and no external embedding APIs. There are two retrieval/caching layers, both keyword/lexical:

  • Codebase RAG (core/knowledge/rag.py): SQLite FTS5 with BM25 ranking and AST-aware chunking for Python files.
  • Semantic cache (core/knowledge/semantic_cache.py): TF (term-frequency) cosine similarity over word counts, not learned embeddings.

If you need real semantic retrieval (vector DB, neural embeddings), wire it yourself via the retrieval role/skill in templates/; nothing in core performs vector search.
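For a feel of what term-frequency cosine similarity over word counts means in practice, here is a self-contained sketch. The tokenizer regex and the function names are assumptions for illustration and are not taken from core/knowledge/semantic_cache.py.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count word-like tokens (tokenizer is an assumption)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def tf_cosine(a: str, b: str) -> float:
    """Cosine similarity between the raw term-count vectors of two strings."""
    ca, cb = term_counts(a), term_counts(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

Two prompts that share no words score 0.0 however close their meaning, which is the lexical limitation the paragraph above describes.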

install

| Method | Command |
|---|---|
| One-liner (macOS / Linux) | curl -fsSL https://bernstein.run/install.sh \| sh |
| One-liner (Windows) | irm https://bernstein.run/install.ps1 \| iex |
| pip | pip install bernstein |
| pipx | pipx install bernstein |
| uv | uv tool install bernstein |
| Homebrew | brew tap chernistry/tap && brew install bernstein |
| Fedora / RHEL | sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein |
| npm (wrapper) | npx bernstein-orchestrator |
| Docker (GHCR) | docker run --rm -v "$PWD:/work" -w /work -e ANTHROPIC_API_KEY ghcr.io/sipyourdrink-ltd/bernstein:latest run -g "fix tests/test_foo.py" |

The one-liner scripts check for Python 3.12+, bootstrap pipx when it's missing, fix PATH for the current session, and install (or upgrade) bernstein. Script sources: install.sh · install.ps1.

optional extras

Provider SDKs are optional so the base install stays lean.

| Extra | Enables |
|---|---|
| bernstein[openai] | OpenAI Agents SDK v2 adapter (openai_agents) |
| bernstein[docker] | Docker sandbox backend |
| bernstein[e2b] | E2B microVM sandbox backend (needs E2B_API_KEY) |
| bernstein[modal] | Modal sandbox backend, optional GPU (needs MODAL_TOKEN_ID / MODAL_TOKEN_SECRET) |
| bernstein[s3] | S3 artifact sink (via boto3) |
| bernstein[gcs] | Google Cloud Storage artifact sink |
| bernstein[azure] | Azure Blob artifact sink |
| bernstein[r2] | Cloudflare R2 artifact sink (S3-compatible boto3) |
| bernstein[grpc] | gRPC bridge |
| bernstein[k8s] | Kubernetes integrations |

Combine extras with brackets, e.g. pip install 'bernstein[openai,docker,s3]'.

Editor extensions: VS Marketplace · Open VSX

security

  • OpenSSF Scorecard. Weekly run via .github/workflows/scorecard.yml. Results uploaded to GitHub Code Scanning. Badge above.
  • Fuzzing. ClusterFuzzLite config at .clusterfuzzlite/ plus a cifuzz-pr workflow (.github/workflows/cifuzz-pr.yml) provide an OSSF-recognized fuzzing harness on top of the existing Hypothesis property-test suite.
  • Vulnerability disclosure. See SECURITY.md.

contributing

PRs welcome. See CONTRIBUTING.md for setup and code style.

support

If Bernstein saves you time: GitHub Sponsors.

Contact: [email protected].

featured in

More awesome-lists, MCP catalogs, and prior-art citations

Awesome lists: Jenqyang/Awesome-AI-Agents, jamesmurdza/awesome-ai-devtools, jim-schwoebel/awesome_ai_agents, Piebald-AI/awesome-gemini-cli, ComposioHQ/awesome-codex-skills, punkpeye/awesome-mcp-servers, jxzhangjhu/Awesome-LLM-RAG, rohitg00/awesome-claude-code-toolkit, numtide/llm-agents.nix, andyrewlee/awesome-agent-orchestrators, bradAGI/awesome-cli-coding-agents, milisp/awesome-codex-cli, yaolifeng0629/Awesome-independent-tools, caramaschiHG/awesome-ai-agents-2026, ai-for-developers/awesome-vibe-coding, taishi-i/awesome-ChatGPT-repositories, eudk/awesome-ai-tools, killop/anything_about_game, vinta/awesome-python, Zijian-Ni/awesome-ai-agents-2026, rohitg00/awesome-devops-mcp-servers, Glama MCP Catalog. Mirrors: icopy-site/awesome, icopy-site/awesome-cn, trackawesomelist/trackawesomelist.

Prior-art citations by peer projects: mkb23/overcode, Vintersong/NOVA-Cognition-Framework, AJV009/drupal-contrib-workbench, danielvaughan/codex-blog.

Directories: AlternativeTo.

cite

Machine-readable metadata lives in CITATION.cff (CFF 1.2.0); GitHub renders the "Cite this repository" button automatically. A Zenodo DOI will be minted on the next release.

license

Apache License 2.0


Alex Chernysh · GitHub · X · bernstein.run

Translations available in 11 languages: see docs/i18n/.

About

Audit-grade multi-agent orchestration for CLI coding agents (Claude Code, Codex, Gemini CLI, +40 more). HMAC-chained audit log, signed agent cards, per-artefact lineage, air-gap deploy. The orchestrator your compliance team will sign off on. https://bernstein.run
