bernstein - CLAUDE.md

Overview

Bernstein is named after Leonard Bernstein, the American conductor and composer. The project orchestrates a crew of CLI coding agents the way Bernstein conducted the New York Philharmonic: every player on cue, the score deterministic, the conductor accountable for the result.

Module map

src/bernstein/core/ - orchestration engine

File Purpose
compat_redirect_ledger.py Compatibility ledger for legacy bernstein.core redirects
credential_scoping.py Agent credential scope minimization for least-privilege API keys
dataclass_helpers.py Helpers for preserving dataclass instance types through updates
defaults.py Centralized default values for the Bernstein orchestrator
streaming_merge.py Streaming task results for long-running agents (incremental merge)
agents/ agents sub-package
approval/ Interactive tool-call approval (op-002)
autofix/ Bernstein autofix daemon - auto-repair CI failures on Bernstein PRs
autoheal/ Auto-heal v2 subpackage
chat/ Chat-control bridges for driving Bernstein agents from messaging apps
communication/ communication sub-package
compliance/ Compliance subpackage
config/ Config: seed parsing, config management, settings, feature gates
cost/ cost sub-package
daemon/ Daemon installation helpers for Bernstein
devops/ devops sub-package
distribution/ Distribution utilities - air-gap wheelhouse build, verify, signing
errors/ Structured first-run error categorization for Bernstein
fleet/ Fleet dashboard - supervise multiple Bernstein projects in one view
git/ git sub-package
grpc_gen/ Generated gRPC stubs - run scripts/generate_proto.sh to populate
handoff/ Session handoff between terminal and chat/dashboard surfaces (op-005)
identity/ Install-rev identity module - passive, operator-decodable install fingerprint
integrations/ Integrations sub-package
interop/ Cross-organisation interoperability surfaces
knowledge/ knowledge sub-package
lifecycle/ Lifecycle-hooks subsystem
lineage/ Lineage v1 - Sigstore-style per-artefact transparency log
memory/ memory sub-package - persistent memory stores
notifications/ Outbound notification subsystem (release 1.9)
observability/ observability sub-package
orchestration/ orchestration sub-package
persistence/ persistence sub-package
planning/ planning sub-package
plugins_core/ plugins_core sub-package
preview/ bernstein preview - sandboxed dev-server with public tunnel link
protocols/ protocols sub-package
quality/ quality sub-package
replay/ Deterministic replay package for Bernstein agent runs
review/ Per-adapter perspective assignment and chain coordination for reviews
review_responder/ PR review responder - react to inline review comments on Bernstein PRs
routes/ FastAPI router modules for the Bernstein task server
routing/ routing sub-package
sandbox/ Pluggable sandbox backends for agent isolation (oai-002 phase 1)
security/ security sub-package
server/ server sub-package - re-exports for backward compatibility
sessions/ Session-level orchestration primitives that span multiple subsystems
simulate/ Digital-twin orchestration simulator (issue #1374)
skills/ Progressive-disclosure skill packs (oai-004)
storage/ Pluggable artifact storage sinks (oai-003)
substrate/ Substrate: register Bernstein into host applications (MCP servers, etc.)
tasks/ tasks sub-package
telemetry/ Opt-in operator observability for Bernstein
tokens/ tokens sub-package
trackers/ Tracker adapter subsystem
trigger_sources/ Trigger source adapters - normalize raw events into TriggerEvent
tunnels/ Tunnel provider abstraction and registry
workflows/ Archon-inspired YAML workflow manifests
worktrees/ Worktree inventory and garbage-collection helpers

src/bernstein/adapters/ - CLI agent adapters

File Purpose
_contract.py Adapter contract loader and capability checker
aichat.py AIChat CLI adapter
aider.py Aider CLI adapter
amp.py Amp CLI adapter
auggie.py Auggie (Augment Code) CLI adapter
autohand.py Autohand Code CLI adapter
base.py Base adapter for CLI coding agents
caching_adapter.py Caching wrapper for CLI adapters to enable prompt prefix deduplication and response reuse
charm.py Charm Crush CLI adapter
claude.py Claude Code CLI adapter
claude_agents.py Build per-task Claude Code subagent definitions for the --agents flag
claude_cache_control.py Anthropic API cache-control block builder for the Claude Code adapter
claude_exit_codes.py Map Claude Code exit codes to Bernstein AbortReason/TransitionReason
claude_mcp_loader.py MCP config loading and merging for the Claude Code adapter
claude_routine.py Claude Code Routine adapter - offloads tasks to Anthropic cloud via /fire API
claude_stream_parser.py Parse Claude Code --output-format stream-json events
claude_wrapper_script.py Inline wrapper script source + assembly for the Claude Code adapter
cline.py Cline CLI adapter
clm.py CLM sovereign LLM adapter - drives a customer-side CLM gateway
clm_tls_launcher.py mTLS launcher for the CLM adapter
cloudflare_agents.py Cloudflare Agents SDK adapter for local wrangler dev or deployed worker trigger
codebuff.py Codebuff CLI adapter
codex.py OpenAI Codex CLI adapter
codex_cloudflare.py Codex adapter for Cloudflare Sandbox execution
cody.py Sourcegraph Cody CLI adapter
composio.py Composio Agent Orchestrator (ao) CLI adapter
conformance.py Adapter tool contract conformance suite harness
continue_dev.py Continue.dev CLI adapter
copilot.py GitHub Copilot CLI adapter
cursor.py Cursor Agent CLI adapter
devin_terminal.py Devin for Terminal (Cognition) CLI adapter
droid.py Droid (Factory AI) CLI adapter
env_isolation.py Environment variable isolation for spawned agents
forge.py Forge CLI adapter
gemini.py Google Gemini / Antigravity CLI adapter
generic.py Generic CLI adapter for arbitrary coding agent CLIs
goose.py Goose CLI adapter for Bernstein
gptme.py gptme CLI adapter
hermes.py Hermes Agent (Nous Research) CLI adapter
iac.py Infrastructure-as-Code (Terraform/Pulumi) adapter for Bernstein
junie.py JetBrains Junie CLI adapter
kilo.py Kilo CLI adapter (Stackblitz)
kimi.py Kimi CLI adapter
kiro.py Kiro CLI adapter
letta_code.py Letta Code CLI adapter
manager.py
mistral.py Mistral Vibe CLI adapter
mock.py Mock CLI adapter for zero-API-key demos and testing
ollama.py Ollama / OpenAI-compatible local LLM adapter - run coding agents without cloud API keys
open_interpreter.py Open Interpreter CLI adapter
openai_agents.py OpenAI Agents SDK v2 adapter
openai_agents_runner.py Python entrypoint that runs an OpenAI Agents SDK session
opencode.py OpenCode CLI adapter
openhands.py OpenHands CLI adapter
pi.py Pi (pi-coding-agent) CLI adapter
plandex.py Plandex CLI adapter
plugin_sdk.py Adapter plugin SDK for third-party agent integration
q_dev.py AWS Q Developer CLI adapter (binary: q)
qwen.py Qwen CLI adapter for OpenAI compatible models
ralphex.py Ralphex (umputun/ralphex) CLI adapter
registry.py Adapter registry - look up CLI adapters by name
report.py Adapter conformance + capability report
rovo.py Atlassian Rovo Dev CLI adapter
session_id.py Deterministic session-id binding for adapter replay isolation
skills_injector.py Inject per-task Claude Code skills into the worktree before spawn
strict_schema.py Strict structured-output validation and user-owned-field protection
use_cases.py Per-adapter metadata for the bernstein integrations list command
ci/ CI system adapters for log parsing and failure extraction

src/bernstein/agents/ - agent catalog & discovery

File Purpose
agency_provider.py AgencyProvider - loads CatalogAgent instances from msitarzewski/agency-agents format
catalog.py Agent catalog registry - loads agent definitions from external sources
discovery.py Agent directory auto-discovery for Bernstein
registry.py Dynamic agent registry with YAML-based definitions and hot-reload support

src/bernstein/cli/ - Click CLI

File Purpose
api_warmup.py API preconnect warmup -- send a minimal request to warm provider connections
badge.py Powered-by badge helper for bernstein init --add-badge
commit_stats.py Commit attribution stats: gather per-role commit stats via git log
dashboard.py Bernstein TUI -- retro-futuristic agent orchestration dashboard
dashboard_actions.py Dashboard side panels and expert views
dashboard_app.py Bernstein TUI application -- main App class and entry point
dashboard_header.py Dashboard header and agent display widgets
dashboard_polling.py Dashboard polling helpers, data loaders, formatters, and constants
debug_bundle.py Debug bundle export -- bernstein debug bundle
first_run_guard.py First-run guard helpers wiring categorisation into CLI entry points
helpers.py Shared constants, helpers, and utilities for Bernstein CLI modules
install_check.py Installation mismatch detection -- detect multiple Bernstein installs and config conflicts
keybindings.py Keybinding system for the Bernstein TUI
live.py Live view helpers for bernstein live --classic
main.py CLI entry point for Bernstein -- declarative agent orchestration
notebook_traces.py Notebook-aware traces - detect and track Jupyter notebook cell edits
release_notes.py Release notes display - fetch and format CHANGELOG.md for terminal output
run.py Enhanced run output for bernstein run
run_archive.py Export full run archive as ZIP
run_bootstrap.py Main Click commands and execution bootstrap for Bernstein runs
run_cmd.py Run commands: init, conduct, downbeat (legacy start), and the main CLI group
run_confirm.py Recipe/cook commands, demo, and confirmation helpers for Bernstein runs
run_names.py Memorable deterministic run names for user-facing surfaces (#1626)
run_preflight.py Preflight cost estimation and runtime warnings for Bernstein runs
settings_snapshot.py Settings snapshot - capture and serialize effective settings for traces
side_query.py Side-question ("btw") protocol for non-blocking agent queries
status.py Formatted status output for bernstein status
transcript_search.py Search across agent transcript/trace files in .sdd/traces/
ui.py Shared Rich UI components for Bernstein CLI
usage_provisioning.py Usage budget tracking and overage detection for Bernstein
commands/ commands sub-package
display/ ui sub-package
doctor/ Bernstein doctor sub-package - extended diagnostic checks
plan/ plan sub-package
scaffold/ Bernstein scaffold subsystem
utils/ utils sub-package

src/bernstein/evolution/ - self-evolution engine

File Purpose
_shared.py Shared constants, data classes, and helpers for the evolution loop modules
aggregator.py Metrics aggregation with EWMA, CUSUM, BOCPD, and Goodhart defenses
applicator.py Change applicator - execute upgrades via file modification
benchmark.py Tiered benchmark runner for evolution validation
circuit.py CircuitBreaker - halt evolution when safety conditions are violated
creative.py Creative evolution pipeline - visionary → analyst → production gate
cycle_helpers.py Evolution cycle helper logic - community, creative, and GitHub sync
cycle_runner.py Evolution cycle execution engine
data_collector.py Metric record types and file-based metrics collection for the evolution system
detector.py Opportunity detection from aggregated metrics
gate.py ApprovalGate and EvalGate - risk-stratified routing for evolution proposals
governance.py Adaptive governance for the evolution system
invariants.py InvariantsGuard - hash-lock safety-critical files
loop.py Autoresearch evolution loop - continuous self-improvement via experiment cycles
oscillation_guard.py Oscillation guard for prompt-evolution proposals
predicted_delta.py Predicted-delta gate for prompt-evolution proposals
proposal_scorer.py Proposal risk scoring and routing classification
proposals.py Upgrade proposal generation
report.py Evolution observability - history table and static report generation
report_generator.py Analysis result types, statistical helpers, and Goodhart's Law defenses
risk.py Strategic Risk Score (SRS) computation for evolution proposals
sandbox.py SandboxValidator - isolated testing of evolution proposals
types.py Shared types for the evolution system
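As a rough illustration of two of the techniques aggregator.py is described as using, the sketch below computes an exponentially weighted moving average and a one-sided CUSUM alarm over a metric series. The function names, parameters, and thresholds are hypothetical and are not taken from the Bernstein source; the real module may structure this differently.

```python
from collections.abc import Sequence
from typing import Optional


def ewma(values: Sequence[float], alpha: float) -> list[float]:
    """Exponentially weighted moving average, seeded with the first value."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: list[float] = []
    for x in values:
        # The first sample has no history, so it acts as its own prior.
        prev = smoothed[-1] if smoothed else x
        smoothed.append(alpha * x + (1.0 - alpha) * prev)
    return smoothed


def cusum_first_alarm(
    values: Sequence[float], target: float, slack: float, threshold: float
) -> Optional[int]:
    """Index of the first upward-shift alarm, or None if no alarm fires.

    One-sided CUSUM: accumulate deviations above target + slack and raise
    an alarm once the running sum exceeds threshold.
    """
    running = 0.0
    for index, x in enumerate(values):
        running = max(0.0, running + (x - target - slack))
        if running > threshold:
            return index
    return None


# Illustrative series: a metric that jumps from 0 to 5 at index 3.
series = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
smoothed_series = ewma(series, alpha=0.5)
alarm_index = cusum_first_alarm(series, target=0.0, slack=1.0, threshold=6.0)
```

EWMA smooths noise out of a metric, while CUSUM accumulates small persistent deviations, which is why the two are often paired for drift detection.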

src/bernstein/eval/ - evaluation harness

File Purpose
ab_runner.py A/B runner primitive - deterministic prompt-vs-prompt comparison
baseline.py Baseline tracking for eval-gated evolution
calibration.py Calibration log + Brier score for router and judge decisions
golden.py Golden benchmark suite - curated tasks for eval
harness.py Eval harness - multiplicative scoring, LLM judge, failure taxonomy
incident_synthesizer.py Convert dead-letter and post-mortem incidents into regression eval cases
judge.py LLM judge - evaluate code quality of agent-produced changes
metrics.py Custom eval metrics - each metric is a dataclass with a compute method
pentest_consensus.py Consensus aggregation for the multi-adapter pentest fan-out
pentest_runner.py Driver that runs the security-pentest scenario end-to-end
pentest_scorer.py Precision and recall scorer for security pentest eval scenario
scenario_generator.py Forward-looking synthetic scenario generator
scenario_runner.py Scenario runner - execute YAML-defined eval scenarios against the live codebase
taxonomy.py Failure taxonomy - classify every eval failure into a closed set
telemetry.py Telemetry contract - strict schema for agent output metadata
vcr_fixture.py VCR fixture pattern - dehydrate/hydrate deterministic test fixtures (T805)
yaml_runner.py YAML eval harness - operator-runnable spec format with judge and golden diff
golden_data/ Packaged golden benchmark fixtures (ships in wheel via package-data)
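For readers unfamiliar with the Brier score that calibration.py is described as logging, here is a minimal sketch of the standard binary definition: the mean squared difference between predicted probabilities and observed 0/1 outcomes. The function name and signature are illustrative assumptions, not the module's actual API.

```python
from collections.abc import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    0.0 is a perfect score; always predicting 0.5 yields 0.25.
    """
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((p - o) ** 2 for p, o in zip(probabilities, outcomes))
    return sum(squared_errors) / len(probabilities)


# Hypothetical router decisions: predicted success probability vs. actual result.
score = brier_score([0.9, 0.2], [1, 0])
```

Lower is better: a judge or router whose confidence tracks actual outcomes scores closer to zero.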

src/bernstein/plugins/ - plugin system (pluggy)

File Purpose
hookspecs.py Hook specifications - defines extension points for Bernstein plugins
manager.py Plugin manager - discovers, loads, and invokes Bernstein plugins
permission_explain.py Progressive disclosure for permission requests -- explain-before-approve mode
plugin_errors.py Plugin error collection and reporting
plugin_trust.py Plugin trust checking and risk scoring for Bernstein plugins
security_review.py Security review: regex-based pattern scanning for agent-produced diffs

src/bernstein/tui/ - Textual TUI

File Purpose
accessibility.py TUI-013: Accessibility mode for the Bernstein TUI
activity_tracker.py Activity tracking metrics with duration accounting
agent_duration.py Agent duration display for TUI agent panel
agent_log.py Log and quality gate widgets for the Bernstein TUI
agent_states.py TUI-005: Visual distinction for agent states
app.py Main Textual application for the Bernstein TUI session manager
approval_panel.py Approval, waterfall, tool observer, and SLO widgets for the Bernstein TUI
away_summary.py Away summary generation for orchestration recap
clipboard.py TUI-007: Copy-to-clipboard for task IDs, agent logs, error messages
color_mode.py Terminal color mode detection - auto-detect truecolor/256-color/ANSI
command_palette.py Command palette with fuzzy search for TUI actions
context_files_doctor.py Context warnings for bernstein doctor -- detect stale configs, bad files, MCP issues
context_tokens.py Context token accounting - analyze where context budget is spent
contextual_tips.py Contextual tips system with cooldown for Bernstein CLI
cost_sparkline.py TUI-006: Cost sparkline for the TUI sidebar
dependency_graph.py ASCII-art dependency graph renderer for tasks
diff_folding.py Diff folding - collapsible diff display for large changes
diff_render.py Word-level diff rendering for Bernstein TUI
fallback.py Rich-based fallback display for unsupported terminals (TUI-003)
help_screen.py Help screen modal for TUI - shortcuts plus discoverability hints
keybinding_config.py TUI-004: Configurable keybinding system with YAML-based key map
layout_cache.py Dirty-flag layout caching - skip layout recalculation when nothing changed
layout_persistence.py Persistent layout customization and presets for the Bernstein TUI
log_viewer.py Syntax highlighting, diff folding, and markdown rendering for agent logs
mouse_support.py TUI-017: Mouse support for panel interaction
notification_badge.py TUI-020: Notification badge for background events
output_styles.py Output style customization -- load per-project agent output format preferences
progress_bar.py TUI-010: Task progress bar with completion percentage
scheduling_panel.py Scheduling panel widget for TUI - visualizes scheduling decisions
session_analytics.py Session analytics - analyze task traces for insights
session_recorder.py TUI-019: Session recording and playback
session_rename.py Session rename functionality - rename the current session safely
session_tags.py Session tag system - add searchable tags to the current orchestration session
snapshot_testing.py TUI visual regression snapshot testing utilities
split_pane.py TUI-008: Split-pane view (tasks list + live agent log)
status_bar.py Status, scratchpad, and coordinator widgets for the Bernstein TUI
task_context.py Task context panel for the Bernstein TUI
task_detail_overlay.py TUI-018 / UX-006: Task detail overlay with tabbed sections
task_list.py Task display widgets and constants for the Bernstein TUI
task_search.py Task search, filter parsing, and input widget for the TUI
themes.py TUI-011: Dark/light theme support
timeline.py Timeline view of task execution for the Bernstein TUI
toast.py TUI-009: Notification toast for events
tokens.py Per-agent token usage tracker - lightweight thread-safe token accounting
vim_mode.py TUI-014: Vim-mode keybindings for TUI navigation
widgets.py Custom Textual widgets for the Bernstein TUI
worker_badges.py Worker badge identity module - format worker metadata into Rich badge strings
worktree_status.py Compact runtime and worktree health pane for the Bernstein TUI

src/bernstein/github_app/ - GitHub App integration

File Purpose
app.py GitHub App authentication: JWT creation and installation token exchange
check_runs.py GitHub Check Runs API client
ci_router.py CI failure routing: blame attribution and enriched fix-task generation
cost_reporter.py PR cost annotation: post agent run cost summaries as GitHub PR comments
mapper.py Event-to-task conversion: maps GitHub webhook events to Bernstein task payloads
slash_commands.py Slash command parser for /bernstein comments on GitHub issues and PRs
webhooks.py Webhook parsing and HMAC-SHA256 signature verification
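The HMAC-SHA256 verification that webhooks.py is described as performing can be sketched as follows. GitHub sends the signature in the X-Hub-Signature-256 header as sha256= followed by a hex digest of the raw request body. The function name below is hypothetical, and the real module may handle parsing and errors differently.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Return True when header_value matches the HMAC-SHA256 of the raw body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # Compare as bytes in constant time to avoid leaking match position.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


# Illustrative values only; a real secret comes from the GitHub App settings.
demo_secret = b"s3cret"
demo_body = b'{"action": "opened"}'
demo_header = "sha256=" + hmac.new(demo_secret, demo_body, hashlib.sha256).hexdigest()
is_valid = verify_github_signature(demo_secret, demo_body, demo_header)
```

The digest must be computed over the exact bytes received, before any JSON parsing, because re-serialising the payload can change whitespace and invalidate the signature.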

src/bernstein/mcp/ - MCP server

File Purpose
capability.py Runtime capability cards for the Bernstein MCP server
cost_meter.py Per-call cost-meter and observability envelope for MCP tool responses
input_validation.py Schema-validated MCP tool-call inputs with deny-by-default
oauth.py OAuth-2 / OIDC discovery metadata for the Bernstein MCP server
prompts.py Built-in MCP prompts for the Bernstein server
remote_transport.py Streamable HTTP transport for Bernstein MCP server
routine_tools.py MCP tools for the rt-003 Routine <-> Scenario bridge
server.py Bernstein MCP server
streaming.py In-flight tool-call tracking with cancellation and partial-result preservation
resources/ MCP resource registrars for Bernstein
tool_schemas/ tool_schemas sub-package

src/bernstein/benchmark/ - SWE-bench

File Purpose
comparative.py Comparative Benchmark Suite for Bernstein
golden.py Golden test suite for the Bernstein orchestrator
programbench.py ProgramBench evaluation harness for Bernstein
reproducible.py Reproducible benchmark suite for Bernstein
swe_bench.py SWE-Bench evaluation harness for Bernstein

Key non-package directories

Path Purpose
templates/roles/ Jinja2 role prompts (manager, backend, qa, security, devops, etc.)
templates/prompts/ Prompt templates (judge.md, etc.) - bundled into wheel
.sdd/ All runtime state (never commit .sdd/runtime/)
.sdd/backlog/open/ YAML task specs waiting to be picked up
.sdd/backlog/claimed/ Tasks currently being worked
.sdd/backlog/closed/ Completed/cancelled tasks
.sdd/runtime/ PIDs, logs, session state, signal files
.sdd/metrics/ JSONL metric records
.sdd/traces/ JSONL agent traces
.sdd/agents/catalog.json Registered agent catalog
tests/unit/ Fast unit tests (no network)
tests/integration/ Integration tests (require running server)
scripts/run_tests.py Per-file isolated test runner
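To make the backlog layout above concrete, the sketch below counts task specs per state directory. It assumes one .yaml file per task in each of the three directories; the table only states that open/ holds YAML specs, so the extension and the per-file layout for claimed/ and closed/ are assumptions, and the helper name is hypothetical.

```python
from pathlib import Path

# State directories listed under .sdd/backlog/ in the table above.
BACKLOG_STATES = ("open", "claimed", "closed")


def backlog_counts(project_root: Path) -> dict[str, int]:
    """Count *.yaml task specs in each backlog state directory.

    Missing directories count as zero, so this works on a fresh checkout.
    """
    backlog = project_root / ".sdd" / "backlog"
    counts: dict[str, int] = {}
    for state in BACKLOG_STATES:
        state_dir = backlog / state
        if state_dir.is_dir():
            counts[state] = sum(1 for _ in state_dir.glob("*.yaml"))
        else:
            counts[state] = 0
    return counts
```

Calling backlog_counts(Path(".")) from a project root returns a dictionary keyed by state name.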

Build & test

pip install -e ".[dev]"    # install (quoted so zsh does not glob the brackets)
pytest                     # tests
uv run ruff check .        # lint
uv run ruff format .       # format
uv run mypy src            # type-check

Setup

  1. Run uv sync to install dependencies and lock the project.
  2. Run uv run python -m bernstein --help (or your equivalent entry point) to confirm the install.
  3. See Build & test for the recurring commands.

Architecture (entry points)

Top-level entry points exposed by the package:

Command Entry point
bernstein bernstein.cli.main:cli
bernstein-worker bernstein.core.worker:main

Back-compat aliases. Legacy import paths (e.g. bernstein.core.orchestrator) are served by a sys.meta_path finder, not by physical shim files. The mechanism lives in src/bernstein/core/__init__.py as _CoreRedirectFinder driven by the _REDIRECT_MAP dict - add new aliases there rather than creating shim modules at the old path.
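The redirect mechanism described above can be illustrated with a minimal sys.meta_path finder. This is a sketch of the general technique, not the actual _CoreRedirectFinder: the class names and the demo map are hypothetical, and it aliases a made-up path under the standard-library json package so it runs without Bernstein installed.

```python
import importlib
import importlib.abc
import importlib.util
import sys

# Hypothetical map: legacy dotted path -> current dotted path.
DEMO_REDIRECT_MAP = {"json.legacy_decoder": "json.decoder"}


class _AliasLoader(importlib.abc.Loader):
    """Loader that hands back the already-importable target module."""

    def __init__(self, target_name: str) -> None:
        self._target_name = target_name
        self._original_spec = None

    def create_module(self, spec):
        module = importlib.import_module(self._target_name)
        # importlib overwrites __spec__ on whatever module we return,
        # so remember the real one and restore it in exec_module.
        self._original_spec = module.__spec__
        return module

    def exec_module(self, module) -> None:
        # The target was initialised by its own import; only undo the
        # __spec__ overwrite so the real module keeps its identity.
        module.__spec__ = self._original_spec


class DemoRedirectFinder(importlib.abc.MetaPathFinder):
    """Resolve legacy names from DEMO_REDIRECT_MAP; ignore everything else."""

    def find_spec(self, fullname, path, target=None):
        new_name = DEMO_REDIRECT_MAP.get(fullname)
        if new_name is None:
            return None  # let the normal import machinery handle it
        return importlib.util.spec_from_loader(fullname, _AliasLoader(new_name))


# Install ahead of the default finders so legacy names are seen first.
sys.meta_path.insert(0, DemoRedirectFinder())

import json
import json.legacy_decoder as legacy  # resolved through the finder
```

Because the alias and the target are the same module object, state and isinstance checks stay consistent across old and new import paths, which is the main advantage over physical shim files that re-export names.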

Git workflow

Default branch: main.

Agent roles

Bernstein ships agent role prompts under templates/roles/. The orchestrator loads them at task-spawn time; you don't write to them manually.

  • adversary
  • analyst
  • architect
  • backend
  • ci-fixer
  • devops
  • docs
  • frontend
  • manager
  • ml-engineer
  • prompt-engineer
  • qa
  • resolver
  • retrieval
  • reviewer
  • security
  • visionary
  • vp

Documentation duty

Every PR that adds or changes a feature MUST update docs in the same PR:

  • User-visible behaviour: update the relevant README.md section.
  • Operator workflows: update docs/operations/<area>.md.
  • Public API surface: regenerate docs/api/ schemas.
  • Architecture or new module: update docs/sdd/ and run bernstein agents-md sync so AGENTS.md, CLAUDE.md, .goosehints, CONVENTIONS.md, and .cursor/rules/*.mdc stay aligned.
  • New test layer: also update docs/contributing/testing.md.

PRs without the matching docs change will be sent back. Docs and code ship together.