
ironbee-ai/ironbee-cli


IronBee CLI

The CLI for IronBee — Verification and Intelligence Layer for Agentic Development

License: Elastic License 2.0


IronBee ensures that AI agents verify their code changes before completing a task. When an agent edits code, it cannot finish until it exercises the affected paths through real tools — in the browser for frontend changes, against the wire protocol (HTTP / gRPC / GraphQL / WebSocket) for any-runtime backend changes, or via the Node.js V8 inspector for Node-specific backend changes — and submits a passing verdict.

No more "it should work" — every change is tested.

IronBee also tracks every verification cycle — coding time, fix time, pass/fail rates, problematic files — and provides session and project-level analytics for LLM-powered semantic insights.

Powered by IronBee DevTools (@ironbee-ai/devtools), which runs in three modes from the same package:

  • Browser mode (bdt_* tools, default-on): the agent navigates pages, clicks buttons, fills forms, takes screenshots, checks console errors.
  • Backend mode (bedt_* tools, opt-in, runtime-agnostic): the agent drives real HTTP / gRPC / GraphQL / WebSocket calls against your backend, inspects logs, and queries databases — works for Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, and Scala backends alike.
  • Node mode (ndt_* tools, opt-in): the agent connects to a running Node process, sets V8 probes (tracepoint / logpoint / exceptionpoint) at the changed code paths, exercises them, and reads back snapshots or runtime logs.

A single Stop hook can drive multiple cycles in parallel — touching frontend, a backend protocol, and a Node runtime in the same change requires evidence for each before the task can complete.

Demo

[Video: IronBee CLI demo, 30 seconds]

Supported Clients

| Client | Status |
| --- | --- |
| Claude Code | Supported |
| Cursor | Supported |
| Codex | Planned |
| OpenCode | Planned |

Quick Start

Install IronBee globally

npm install -g @ironbee-ai/cli

Set up a project

cd your-project
ironbee install

This auto-detects your AI client and writes:

  • Hook configuration (so the client calls IronBee automatically)
  • Verification skill/rules (so the agent knows the workflow — covers every enabled cycle)
  • MCP server entries from the same @ironbee-ai/devtools package (IronBee DevTools), per-cycle gated — only currently enabled cycles get an entry:
    • browser-devtools (PLATFORM=browser, bdt_ prefix) — registered on first install (browser is the default-on cycle); strip with ironbee browser disable
    • backend-devtools (PLATFORM=backend, bedt_ prefix) — only after ironbee backend enable
    • node-devtools (PLATFORM=node, ndt_ prefix) — only after ironbee node enable
  • Permissions matching the registered entries (mcp__browser-devtools__*, plus mcp__backend-devtools__* and/or mcp__node-devtools__* once their cycles are enabled)

Optional: backfill historical sessions

Already have weeks of Claude Code sessions on disk? ironbee import walks them and ships every session / activity / tool_call / file_change / analytics event to the IronBee Collector — so your dashboard fills with historical context the moment you finish installing. Already-tracked sessions (live or previously imported) are skipped automatically; pass --force to re-import.

Typical three-step flow:

# 1. Preview — zero POSTs, shows exact cost and event counts
ironbee import --since 30d --dry-run

# 2. Confirm and ship (interactive y/N prompt by default)
ironbee import --since 30d

# 3. Optional: cast a wider net later
ironbee import --all-projects --since 6m --concurrency 2

--dry-run always shows the exact cost_usd that will surface in your dashboard before you confirm — $342.18 is much less surprising when you know it's coming.

Common scenarios

| Scenario | Command |
| --- | --- |
| Onboarding: current project, last 30 days | ironbee import --since 30d |
| Current project, full history | ironbee import |
| One specific project from anywhere | ironbee import --projects /path/to/repo |
| Multiple projects | ironbee import --projects /repos/auth,/repos/payments |
| Every project on this machine | ironbee import --all-projects --since 6m |
| Explicit date range (e.g. Q1 retrospective) | ironbee import --all-projects --from 2025-01-01 --to 2025-03-31 |
| Single transcript file (debug / cherry-pick) | ironbee import --transcript ~/.claude/projects/-Users-me-foo/abc.jsonl |
| CI / scripted onboarding (no prompt) | ironbee import --since 60d --yes |
| Tune backend load | ironbee import --since 6m --concurrency 2 (or 16 for fast pipes) |
| Force re-import a single session | ironbee import --transcript path.jsonl --force --yes |

Flag groups

Scope (mutually exclusive — pick at most one; default is the current directory):

  • --transcript <path> — single .jsonl file
  • --projects <p1,p2,...> — comma-separated absolute project paths
  • --all-projects — every directory under ~/.claude/projects/

Time range (mutually exclusive; default is no filter):

  • --since <duration>: 30d, 2w, 6m, 12h (relative to now)
  • --from <iso-date> [--to <iso-date>] — explicit window; --to defaults to now

Behavior:

  • --dry-run — print summary, make zero POSTs, exit 0
  • --yes — skip the confirm prompt
  • --force — bypass the "already tracked" skip rule
  • --concurrency <N> — parallel sessions (default 4, clamped to [1, 32]); also configurable via import.concurrency in ~/.ironbee/config.json or <project>/.ironbee/config.json
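
As a rough illustration of the flag semantics above, the sketch below parses a --since value and clamps --concurrency to the documented [1, 32] range with its default of 4. The helper names are hypothetical, and treating "m" as a 30-day month is an assumption; the README does not say how the CLI sizes a month.

```typescript
// Hypothetical helpers; not taken from the IronBee source.
type Unit = "h" | "d" | "w" | "m";

const HOURS_PER_UNIT: Record<Unit, number> = {
  h: 1,
  d: 24,
  w: 24 * 7,
  m: 24 * 30, // assumption: "m" means months, approximated as 30 days
};

// Convert a --since value such as "30d" into a number of milliseconds.
function parseSince(value: string): number {
  const match = /^(\d+)([hdwm])$/.exec(value.trim());
  if (match === null) {
    throw new Error(`invalid duration: ${value}`);
  }
  const amount = Number(match[1]);
  const unit = match[2] as Unit;
  return amount * HOURS_PER_UNIT[unit] * 60 * 60 * 1000;
}

// Clamp --concurrency to the documented [1, 32] range; default is 4.
function clampConcurrency(requested?: number): number {
  if (requested === undefined || Number.isNaN(requested)) {
    return 4;
  }
  return Math.min(32, Math.max(1, Math.floor(requested)));
}
```

With these helpers, a cutoff timestamp for --since 30d would be Date.now() - parseSince("30d"), and a requested concurrency of 64 would be reduced to 32.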

Optional: opt out of the browser cycle

ironbee browser disable

The browser cycle is the default-on cycle — every code-file edit (40+ extensions: .ts, .tsx, .css, .html, .py, .go, .java, …) requires browser-driven verification (navigate / screenshot / aria / console). Run browser disable for projects where you don't want browser-cycle enforcement (e.g. backend-only services where only backend enable / node enable apply). It writes browser.verifyPatterns: [] to override the legacy 40+ extension default; customizations of alwaysRequired / evidencePaths / additionalVerifyPatterns are preserved.

To re-enable: ironbee browser enable — strips the verifyPatterns: [] override so the code defaults (legacy 40+ extension list) flow back in at runtime. config.json stays minimal; the default list is NOT materialized into the file (it lives in code and tracks the CLI version automatically).

Optional: enable runtime-agnostic backend protocol verification

ironbee backend enable

Activates the backend protocol cycle — drives real HTTP / gRPC / GraphQL / WebSocket calls against your running backend service via the backend-devtools MCP (bedt_* tools) and verifies the responses. Works for any backend runtime: Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, Scala. The command writes a minimal { "backend": {} } block to config — code defaults (multi-language paths covering server/**, api/**, routes/**, controllers/**, handlers/**, services/**) flow in at runtime.

To revert: ironbee backend disable. With no customizations and no lower-layer override, the backend block is dropped entirely (clean config); otherwise the command writes verifyPatterns: [] as a hard kill.

Optional: enable Node.js runtime debug verification

ironbee node enable

Run this once per project that has a Node.js backend and where you want IronBee to gate at the runtime level (V8 inspector probes via node-devtools). It writes a minimal { "node": {} } block to config; code defaults (e.g. server/**, pages/api/**, **/server.{ts,js,mjs,cjs}) flow in at runtime, and nothing is materialized into the file. From then on, edits to matching paths require Node-cycle verification (connect + probes/logs) alongside any browser-cycle verification. To customize, set node.verifyPatterns (replaces defaults) or node.additionalVerifyPatterns (appends).

The node cycle is independent of the backend cycle — backend drives the wire protocol from outside, while node attaches to a Node.js process and sets non-blocking debug probes. Both can be enabled simultaneously; both must pass.

To revert: ironbee node disable. With no customizations the entire node block is dropped (clean config). With customizations or a lower-layer override, writes verifyPatterns: [] (hard kill, preserves alwaysRequired / evidencePaths / additionalVerifyPatterns so re-enabling later restores your tuned setup).

Optional: monitoring-only mode (no enforcement)

ironbee verification disable

Turns off enforcement but keeps the telemetry path intact. Session lifecycle and tool-call events still flow to the IronBee Collector, but the agent never sees a verify-gate, skill, rule, or /ironbee-verify command — useful when you want observability without slowing the agent down. To re-enable: ironbee verification enable.

The toggle re-renders all client artifacts (hooks, skill, rule, MCP servers, permissions) atomically. The change takes effect on the next agent session — restart your editor / agent after toggling.

Cursor: additional setup

Cursor requires manual activation of MCP servers after install:

  1. Restart Cursor to load the new hooks and MCP config
  2. Go to Settings → Tools & MCP and verify each registered IronBee server is enabled — browser-devtools is always present on a default install; backend-devtools appears after ironbee backend enable; node-devtools appears after ironbee node enable
  3. If a server shows as enabled but tools are unavailable, toggle it off and on

Note: This is a known Cursor limitation — MCP servers added via mcp.json may need manual activation.

That's it

The next time your AI agent edits code, IronBee will require verification before the task can complete — browser cycle for frontend changes, backend cycle for runtime-agnostic protocol calls (if enabled), Node cycle for Node.js runtime debug (if enabled), or any combination in parallel.

Commands

ironbee install [project-dir] [--client <name>] [--all]   Set up hooks and config; --all → batch across every registered project
ironbee uninstall [project-dir] [--client <name>] [--all] [-y]   Remove hooks and config; --all → batch wipe across every registered project (destructive, prompts unless --yes)
ironbee update                                    Update IronBee CLI to the latest version (npm self-update)
ironbee status [project-dir]                      Show verdict status for active sessions
ironbee verify [session-id]                       Dry-run verdict validation
ironbee analyze [session-id]                      Analyze session metrics (or all sessions)
ironbee import [options]                          Backfill historical Claude sessions to the IronBee Collector (--since / --from / --to, --transcript / --projects / --all-projects, --dry-run, --yes, --force, --concurrency)
ironbee browser <enable|disable>   [-g|--local] [--client <name>]   Manage the browser cycle (default-on; bdt_* tools via browser-devtools)
ironbee backend <enable|disable>   [-g|--local] [--client <name>]   Manage the runtime-agnostic backend protocol cycle (HTTP/gRPC/GraphQL/WS via backend-devtools)
ironbee node <enable|disable>      [-g|--local] [--client <name>]   Manage the Node.js runtime debug cycle (V8 inspector probes via node-devtools)
ironbee verification <enable|disable> [-g|--local] [--client <name>]   Master verification toggle (enable = enforce; disable = monitoring-only, no enforcement but sessions/tools still ship to collector)
ironbee config get   <key>         [-g|--project|--local]   Read a config value (default: merged effective value; flags narrow to one of the three layers)
ironbee config set   <key> <value> [-g|--local] [--client <name>] [--no-rerender] [--json] [--apply-all|--no-apply-all]   Write a config value; auto re-renders client artifacts on artifact-affecting keys; -g writes global, --local writes project-local (gitignored)
ironbee config unset <key>         [-g|--local] [--client <name>] [--no-rerender] [--apply-all|--no-apply-all]   Remove a config value (idempotent); same target / rerender rules as set
ironbee config list                [-g|--project|--local]   Print the entire config (merged / global / project / local)
ironbee config path                [-g|--local]   Print the on-disk path of the targeted config file (project default; -g for global, --local for project-local)
ironbee register   [-p <dir>] [--client <name>]   Add this project to the user-home inventory (no artifact writes)
ironbee unregister [-p <dir>] [--client <name>]   Remove this project from the user-home inventory (no artifact writes)
ironbee queue status [--session <id>]             Queue status per session (counts, recent dead-letter errors)
ironbee queue drain  [--session <id>]             Synchronously drain pending snapshots
ironbee queue dead-letter list|stats|retry|clear  Inspect / retry / clear dead-letter entries

Projects inventory

ironbee install records each project it touches in ~/.ironbee/projects.json; ironbee uninstall removes it. The inventory powers two cross-project workflows:

  • ironbee install --all — explicit batch op that re-runs install on every registered project. Use after a global config change to propagate it everywhere; uses each project's currently detected clients (or pass --client <name> to override).
  • ironbee uninstall --all — destructive batch op that wipes ironbee from every registered project. Prompts with default-No before acting; pass --yes / -y to skip the prompt. Refuses without --yes in non-interactive contexts.
  • Prompt on global config writes: ironbee config set <key> <val> -g (and unset) on an artifact-affecting key (collector, verification, browser, backend, node, browserDevTools, backendDevTools, nodeDevTools) lists up to 10 other registered project paths still on the prior state and asks Apply this change to these N projects now? [Y/n] (default Yes). Pass --apply-all / --no-apply-all to skip the prompt; non-TTY contexts skip it and print a hint pointing at install --all.

For pure inventory bookkeeping (no artifact writes):

  • ironbee register — adds the current project to the inventory. Useful for projects set up before this feature existed.
  • ironbee unregister — removes the current project from the inventory. Works on already-deleted project dirs.

Agent Commands (slash commands)

IronBee installs slash commands that the agent can use inside Claude Code or Cursor:

| Command | Description |
| --- | --- |
| /ironbee-verify | Verify changes, focused on affected areas (default) |
| /ironbee-verify full | Full verification: complete visual + functional + accessibility checklists |
| /ironbee-verify visual | Visual-only: contrast, layout, spacing, fonts, images, theming |
| /ironbee-verify functional | Functional-only: clicks, forms, navigation, data flow, error handling |
| /ironbee-analyze | Run session analytics and provide LLM-powered semantic insights |

/ironbee-verify guides the agent through a systematic verification process. The default mode focuses on what changed, while full runs every checklist item. Use visual or functional to narrow the scope when you know what type of testing is needed.

Configuration

IronBee loads config from three layers and deep-merges them in order (each later layer overrides the earlier ones), then layers env-var overrides on top:

  1. Global: ~/.ironbee/config.json
  2. Project: <project>/.ironbee/config.json (committed; team-shared)
  3. Project-local: <project>/.ironbee/config.local.json (gitignored; per-machine / per-developer override)
  4. Env-var overrides: selected IRONBEE_* env vars (e.g. IRONBEE_API_KEY → collector.apiKey); env always wins over every file layer. See Env-var overrides below.

The local layer is optional — ironbee install adds .ironbee/config.local.json to .gitignore automatically, but the file is only created when you actually write to it (e.g. ironbee config set ... --local).
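
The file-layer merge described above can be pictured with the following minimal sketch. It is not IronBee's actual loader: the function names are hypothetical, and replacing arrays wholesale (rather than concatenating them) is an assumption, chosen because an override such as verifyPatterns: [] only makes sense if a later layer replaces the list.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Later layers win. Nested objects merge key by key;
// arrays and scalars are replaced whole (assumption).
function deepMerge(base: JsonObject, override: JsonObject): JsonObject {
  const result: JsonObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key];
    result[key] = isObject(existing) && isObject(value) ? deepMerge(existing, value) : value;
  }
  return result;
}

// Merge order: global, then project, then project-local.
function mergeLayers(global: JsonObject, project: JsonObject, local: JsonObject): JsonObject {
  return [global, project, local].reduce(deepMerge, {});
}
```

For example, a global layer with maxRetries: 3, a project layer with maxRetries: 5, and a local layer that only sets collector.url would yield maxRetries: 5 plus the local URL, with any other collector keys from earlier layers left intact.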

{
  "ignoredVerifyPatterns": ["*.test.ts", "*.spec.ts"],
  "maxRetries": 5,

  "browser": {
    "verifyPatterns": ["*.ts", "*.tsx", "*.css"],
    "additionalVerifyPatterns": ["*.mdx"]
  },

  "backend": {
    "verifyPatterns": ["routes/**/*.{go,py,java,ts}", "controllers/**/*.{go,py,java}"]
  },

  "node": {
    "verifyPatterns": ["server/**/*.ts", "pages/api/**/*.ts"]
  },

  "verification": {
    "enable": false
  },

  "fileChange": {
    "captureChangeset": true
  }
}

| Key | Description | Default |
| --- | --- | --- |
| browser.verifyPatterns | Glob patterns for files requiring browser verification (replaces defaults). Four-state semantic: block-absent → code defaults (40+ ext, default-on); block-present + verifyPatterns unset → code defaults (post-browser enable shape); [] → hard kill (also disables additionalVerifyPatterns); custom [...] → user-defined. | 40+ code extensions when block absent OR verifyPatterns unset |
| browser.additionalVerifyPatterns | Extra browser patterns appended to defaults | [] |
| backend.verifyPatterns | Glob patterns activating the runtime-agnostic backend protocol cycle (backend-devtools MCP, bedt_* tools: HTTP / gRPC / GraphQL / WebSocket). Same four-state semantic, default-off: block absent → cycle disabled; block present + verifyPatterns unset → 13 default patterns from code (multi-language: routes/**, controllers/**, handlers/**, services/** across .ts/.js/.py/.go/.java/.rb/.cs/.rs/.kt/.scala/.ex/.exs/.php/.clj); [] → hard kill; custom [...] → user-defined. Opt in via ironbee backend enable. | block absent → disabled; block present + unset → 13 code defaults |
| backend.additionalVerifyPatterns | Extra patterns appended to backend.verifyPatterns (or to code defaults when verifyPatterns is unset). Ignored when verifyPatterns: []. | [] |
| backend.alwaysRequired | Backend-cycle required tools (all-of). Empty default: backend uses any-of evidence paths. | [] |
| backend.evidencePaths | Alternative tool paths; at least one must be fully satisfied. Defaults: protocol-call (any bedt_request_*) OR log-evidence (bedt_log_register-source AND any read/follow) OR db-evidence (bedt_db_connect AND any inspect tool). | protocol-call OR log-evidence OR db-evidence |
| node.verifyPatterns | Glob patterns activating the Node.js runtime debug cycle (node-devtools MCP, ndt_* tools: V8 inspector probes). Same four-state semantic as browser.verifyPatterns, but default-off: block absent → cycle disabled; block present + verifyPatterns unset → 9 default patterns from code (server/**, pages/api/**, **/server.{ts,js,mjs,cjs}, …); [] → hard kill; custom [...] → user-defined. Opt in via ironbee node enable. | block absent → disabled; block present + unset → 9 code defaults |
| node.additionalVerifyPatterns | Extra patterns appended to node.verifyPatterns (or to code defaults when verifyPatterns is unset). Ignored when verifyPatterns: []. | [] |
| node.alwaysRequired | Node-cycle required tools (all-of) | ["ndt_debug_connect"] |
| node.evidencePaths | Alternative tool paths; at least one must be fully satisfied | probe path + log path |
| ignoredVerifyPatterns | Patterns to exclude from verification (checked first, applies to all cycles) | [] |
| maxRetries | Max retry attempts before allowing completion (single global counter regardless of how many cycles run) | 3 |
| verification.enable | Master switch for enforcement. Inverse semantics from recording/jobQueue/collector: verification is the core feature, opt-out via enable: false. When disabled, ironbee runs in monitoring-only mode (no enforcement hooks, skill, rule, or MCP servers; only session/activity/tool_call telemetry flows to the collector). | true |
| fileChange.captureChangeset | When true, every file_change event carries a hunks-only unified-diff changeset string (@@ headers + space/-/+ lines, no filename header; file_path already lives on the parent event). Off by default: the default tool_input whitelist deliberately strips file content from the wire; turning this on routes content through file_change instead. PreToolUse pre-reads the file when enabled so PostToolUse can produce a real before/after diff (Write/Edit on Claude; Write/StrReplace/Delete on Cursor). Skipped on binary content (NUL byte in first 4 KB). | false |
| fileChange.maxChangesetBytes | Hard cap on the changeset string size. Diffs over the cap are sliced on a UTF-8 byte boundary and end with a \n... (truncated, N bytes omitted)\n footer so the collector POST stays within typical reverse-proxy body limits. | 65536 (64 KB) |
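
The four-state semantic of <cycle>.verifyPatterns can be easier to follow as code. The sketch below is an illustration under stated assumptions, not the CLI's implementation: resolvePatterns and its parameters are hypothetical names, codeDefaults stands in for the default list shipped in the CLI, and defaultOn is true for the browser cycle and false for backend and node.

```typescript
interface CycleBlock {
  verifyPatterns?: string[];
  additionalVerifyPatterns?: string[];
}

// Returns the effective pattern list for one cycle.
// An empty result means the cycle never activates.
function resolvePatterns(
  block: CycleBlock | undefined,
  codeDefaults: string[],
  defaultOn: boolean,
): string[] {
  if (block === undefined) {
    // State 1: block absent. Browser falls back to code defaults; opt-in cycles stay off.
    return defaultOn ? [...codeDefaults] : [];
  }
  const extra = block.additionalVerifyPatterns ?? [];
  if (block.verifyPatterns === undefined) {
    // State 2: block present, verifyPatterns unset. Code defaults plus any extras.
    return [...codeDefaults, ...extra];
  }
  if (block.verifyPatterns.length === 0) {
    // State 3: hard kill. additionalVerifyPatterns is ignored too.
    return [];
  }
  // State 4: custom list replaces the defaults; extras are appended.
  return [...block.verifyPatterns, ...extra];
}
```

Under this reading, { "backend": {} } activates the backend cycle with code defaults, while { "backend": { "verifyPatterns": [] } } keeps the block but disables the cycle.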

Editing config from the CLI (ironbee config)

You can edit any of the three config layers via the CLI instead of hand-rolling JSON:

# Read the effective (merged) value across all three layers
ironbee config get collector.url

# Write to project config (default — committed, team-shared)
ironbee config set collector.url https://collector.example.com
ironbee config set maxRetries 5
ironbee config set verification.enable false
ironbee config set browser.verifyPatterns '["*.ts", "*.tsx", "*.css"]'

# Write to global config (~/.ironbee/config.json)
ironbee config set collector.apiKey sk-... --global

# Write to project-local config (<project>/.ironbee/config.local.json — gitignored, per-machine)
ironbee config set collector.url http://localhost:4000 --local

# Remove a value (idempotent — no-op if absent)
ironbee config unset collector.url            # project layer
ironbee config unset collector.url --local    # local layer

# Inspect (default reads merged effective; flags narrow to a single layer)
ironbee config list                # merged effective config across all three layers
ironbee config list --global       # global file only
ironbee config list --project      # project file only
ironbee config list --local        # project-local file only
ironbee config path                # print the project config file path
ironbee config path --local        # print the project-local config file path

Target flags are mutually exclusive: pass at most one of -g/--global, --project, or --local. --project is accepted only by the read commands (get, list); write commands already target the project layer by default.

Type coercion: set parses the value as JSON when it can (true/42/[…]/{…}) and falls back to a raw string when the JSON parse fails. URLs and paths pass through unquoted; pass --json to force strict JSON parsing (e.g. when you want the literal string "42" instead of the number 42).
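
The coercion rule is small enough to sketch directly. coerceValue is a hypothetical helper, and the assumption here is that strict mode simply lets the JSON parse error propagate; the README does not describe the exact error behavior of --json.

```typescript
// Hypothetical helper mirroring the described rule: try JSON first, else keep the raw string.
function coerceValue(raw: string, strictJson = false): unknown {
  if (strictJson) {
    // Strict mode (assumed behavior): invalid JSON is an error, not a string.
    return JSON.parse(raw);
  }
  try {
    return JSON.parse(raw);
  } catch {
    // Not valid JSON (URLs, paths, API keys): store the raw string.
    return raw;
  }
}
```

So coerceValue("5") gives the number 5, coerceValue("https://collector.example.com") stays a string because it is not valid JSON, and coerceValue('["*.ts", "*.tsx"]') becomes an array.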

Smart artifact re-render — when a top-level key affects installed client artifacts (verification, collector, browser, backend, node, browserDevTools, backendDevTools, nodeDevTools), set and unset re-render the client files (hooks, MCP entries, skill, rule, permissions) automatically — same code path verification enable / backend enable / node enable use. Other keys (maxRetries, recording, jobQueue, analytics, import, ignoredVerifyPatterns) are pure config flips that the next agent session picks up — no rerender needed.

Pass --no-rerender to skip the rerender on artifact-affecting keys (handy for scripted bulk edits — follow up with ironbee install to resync). If a rerender fails midway, the config file is rolled back to its prior bytes so disk state never diverges from installed artifacts.

Restart your editor / agent session after changing artifact-affecting keys — the host caches hook config at session start, so the new state takes effect on the next run.

Env-var overrides

A small allowlist of IRONBEE_* env vars overrides specific config paths on top of the three file layers. Useful for secrets that shouldn't be committed (CI runners, ephemeral shells, multi-env desktop setups). Set to a non-empty string to override; unset or empty-string falls back to the file value. Env always wins over every file layer.

| Env var | Config path | Notes |
| --- | --- | --- |
| IRONBEE_API_KEY | collector.apiKey | Lets CI / per-shell setups supply the collector API key without committing it. Combined with a file-set collector.url, the merged effective config has both required fields. |

# Use a one-shot key for this shell only
export IRONBEE_API_KEY=sk-...
ironbee config get collector.apiKey         # returns the env value (merged read)
ironbee config get collector.apiKey --project   # returns only what's in the project file (env bypassed)

Layer-specific reads (--global / --project / --local) bypass env overrides and show only what's on disk in that layer. The default merged read surfaces the env value when set, so it always reflects what the runtime will actually use.

ironbee config set / unset warn when the targeted path is shadowed by a live env override — the file write still succeeds, but the operator's value won't take effect until the env var is unset.
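
A minimal sketch of how such an allowlist could be applied on top of the merged file config follows. The function and constant names are hypothetical; only the IRONBEE_API_KEY → collector.apiKey mapping and the "non-empty string wins, empty or unset falls back" rule come from the text above.

```typescript
type Config = { [key: string]: unknown };

// Allowlist from the table above: env var name -> dotted config path.
const ENV_OVERRIDES: Record<string, string> = {
  IRONBEE_API_KEY: "collector.apiKey",
};

function applyEnvOverrides(fileConfig: Config, env: Record<string, string | undefined>): Config {
  // Deep copy so the merged file config is left untouched.
  const result = JSON.parse(JSON.stringify(fileConfig)) as Config;
  for (const [name, path] of Object.entries(ENV_OVERRIDES)) {
    const value = env[name];
    if (value === undefined || value === "") {
      continue; // unset or empty string: keep the file value
    }
    const keys = path.split(".");
    const leaf = keys.pop() as string;
    let node: Config = result;
    for (const key of keys) {
      const child = node[key];
      if (typeof child !== "object" || child === null || Array.isArray(child)) {
        node[key] = {}; // create intermediate objects as needed
      }
      node = node[key] as Config;
    }
    node[leaf] = value;
  }
  return result;
}
```

In a Node.js program the second argument would typically be process.env. A layer-specific read would skip this step entirely, which matches the behavior of --global / --project / --local described above.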

Default verify patterns

By default, the browser cycle is enabled and matches common code file extensions: .ts, .tsx, .js, .jsx, .css, .scss, .html, .py, .go, .rs, .java, .vue, .svelte, and many more (DEFAULT_BROWSER_VERIFY_PATTERNS). Backend file edits trigger browser verification by default since they often affect frontend behavior. Run ironbee browser disable for projects where the browser-cycle gate isn't appropriate (e.g. backend-only services); ironbee browser enable re-enables.

Patterns are NOT materialized into config.json — they live in the CLI source (DEFAULT_BROWSER_VERIFY_PATTERNS / DEFAULT_BACKEND_VERIFY_PATTERNS / DEFAULT_NODE_VERIFY_PATTERNS) and flow in at runtime when the cycle block exists without an explicit verifyPatterns key. Keeps config.json minimal AND lets defaults track CLI updates automatically (no frozen-at-install-time drift). To customize, set the explicit <cycle>.verifyPatterns (replaces defaults) or <cycle>.additionalVerifyPatterns (appends).

The backend cycle is opt-in via ironbee backend enable and is runtime-agnostic (drives wire protocols via backend-devtools). The node cycle is opt-in via ironbee node enable (only meaningful for Node.js backends — node-devtools is a V8 inspector wrapper).

Non-code files like README.md, package.json, or .gitignore do not trigger any cycle.

Devtools MCP server config

IronBee can register up to three MCP server entries from the same @ironbee-ai/devtools package (IronBee DevTools) — browser-devtools (bdt_ prefix, browser mode), backend-devtools (bedt_ prefix, runtime-agnostic backend mode), and node-devtools (ndt_ prefix, Node mode). Each is per-cycle gated (only enabled cycles get an entry) and can be customized independently via its own config block.

For the browser server, use browserDevTools:

{
  "browserDevTools": {
    "mcp": {
      "url": "http://localhost:4000/mcp"
    }
  }
}

For the backend server, use backendDevTools:

{
  "backendDevTools": {
    "env": { "BACKEND_DEFAULT_HOST": "http://localhost:8080" }
  }
}

For the node server, use nodeDevTools:

{
  "nodeDevTools": {
    "env": { "NODE_INSPECTOR_HOST": "127.0.0.1" }
  }
}

You can mix-and-match: full config replacement via mcp, or just env-var additions via env. The two blocks below combine — one uses mcp for full replacement on the browser server, the other adds env vars to the backend server:

{
  "browserDevTools": {
    "mcp": {
      "command": "node",
      "args": ["./my-server.js"],
      "env": { "MY_VAR": "value" }
    }
  },
  "backendDevTools": {
    "env": { "OTEL_ENABLE": "true" }
  }
}

| Key | Description |
| --- | --- |
| browserDevTools.mcp / backendDevTools.mcp / nodeDevTools.mcp | Full MCP server config, used as-is when provided. Supports command+args (stdio) or url (HTTP) |
| browserDevTools.env / backendDevTools.env / nodeDevTools.env | Extra env vars merged into the default config. Only used when mcp is not provided |

Note: IronBee always sets TOOL_NAME_PREFIX (bdt_ / bedt_ / ndt_), TOOL_INPUT_METADATA_ENABLE=true, and PLATFORM (browser / backend / node) — these cannot be overridden. When collector is configured, an OTEL exporter env block is also auto-injected on every server entry; operators can override individual OTEL_* keys via the env block above.

Verification Flow (multi-cycle)

When the agent tries to complete a task, IronBee runs these checks:

  1. Were code files edited? — If no matching files were changed, the agent completes normally.
  2. Which cycles are active? — IronBee matches each edited file against browser.verifyPatterns and (if you opted in) backend.verifyPatterns and/or node.verifyPatterns. A single file may activate two or three cycles; they all run in parallel and pass/fail combine with AND.
  3. Were the cycle's required tools used?
    • Browser cycle: navigate, screenshot, accessibility snapshot, console check (all-of)
    • Backend cycle: at least one evidence path must be fully exercised — protocol-call (any one of bedt_request_http / bedt_request_grpc / bedt_request_graphql / bedt_request_websocket-open / bedt_request_replay), OR log-evidence (bedt_log_register-source AND any one of bedt_log_read / bedt_log_read-multi / bedt_log_follow), OR db-evidence (bedt_db_connect AND any one of bedt_db_query / bedt_db_describe-table / bedt_db_list-tables / bedt_db_snapshot / bedt_db_diff / bedt_db_get-changes)
    • Node cycle: connect; then either probe path ((put-tracepoint | put-logpoint | put-exceptionpoint) AND get-probe-snapshots) OR log path (get-logs)
  4. Does a verdict exist? — The agent must submit a single verdict via ironbee hook submit-verdict.
  5. Is the verdict valid? — Required: status ∈ {pass, fail} + checks (non-empty array). On fail, issues is required; on pass-after-fail, fixes is required.
  6. Pass or fail? — Server-derived pass criteria from tool_call records is currently a no-op stub (TODO — see verify-gate.ts). For now status: "pass" is honored as-is. When evidence extractors land, per-cycle pass criteria (zero console errors, probe triggered, evidence path exercised) will be derived from the agent's tool_calls and override status: pass to fail when criteria don't hold.
  7. Retry limit — After maxRetries failed attempts (default 3, single global counter), the agent is allowed to complete but must report unresolved issues.
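
Step 3 combines an all-of list (alwaysRequired) with any-of evidence paths, each of which must be fully satisfied. The sketch below shows one way to express that check; the nested-array data model and the function names are assumptions for illustration, while the backend tool names are the ones listed in step 3.

```typescript
// Assumed data model: a path is a list of steps, and each step lists
// interchangeable tools (any one of them satisfies the step).
type EvidencePath = string[][];

interface CycleRequirements {
  alwaysRequired: string[]; // all-of
  evidencePaths: EvidencePath[]; // any-of; an empty list means no path requirement
}

// Backend defaults as listed in step 3.
const backendDefaults: CycleRequirements = {
  alwaysRequired: [],
  evidencePaths: [
    [["bedt_request_http", "bedt_request_grpc", "bedt_request_graphql",
      "bedt_request_websocket-open", "bedt_request_replay"]],
    [["bedt_log_register-source"],
      ["bedt_log_read", "bedt_log_read-multi", "bedt_log_follow"]],
    [["bedt_db_connect"],
      ["bedt_db_query", "bedt_db_describe-table", "bedt_db_list-tables",
        "bedt_db_snapshot", "bedt_db_diff", "bedt_db_get-changes"]],
  ],
};

function cycleSatisfied(req: CycleRequirements, usedTools: Set<string>): boolean {
  const allRequired = req.alwaysRequired.every((tool) => usedTools.has(tool));
  const pathOk =
    req.evidencePaths.length === 0 ||
    req.evidencePaths.some((path) =>
      path.every((step) => step.some((tool) => usedTools.has(tool))));
  return allRequired && pathOk;
}

// Multi-cycle: every active cycle must be satisfied (AND).
function allCyclesSatisfied(active: CycleRequirements[], usedTools: Set<string>): boolean {
  return active.every((req) => cycleSatisfied(req, usedTools));
}
```

Registering a log source without reading from it leaves the log-evidence path half done, so the backend cycle would not count as exercised until a read or follow call (or a complete alternative path) is also present.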

Verdict format

Verdicts are platform-agnostic — the same minimal shape regardless of which cycles (browser / backend / node / multi-cycle) ran. Structural evidence (pages tested, console error counts, endpoints called, log sources, DB connections, probe snapshots, …) is intentionally NOT part of the verdict — the gate (will) derive it from the tool_call records of your bdt_* / bedt_* / ndt_* invocations, so the agent cannot misreport it.

Submit via echo '<json>' | ironbee hook submit-verdict:

{
  "session_id": "<your-session-id>",
  "status": "pass",
  "checks": ["form submits successfully", "new item appears in list"]
}

On failure, include an issues array describing what went wrong:

{
  "session_id": "<your-session-id>",
  "status": "fail",
  "checks": ["form renders", "submit button unresponsive"],
  "issues": ["button click handler not firing", "TypeError in console"]
}

On pass after a previous fail, include a fixes array describing what was fixed:

{
  "session_id": "<your-session-id>",
  "status": "pass",
  "checks": ["form submits successfully", "new item appears in list"],
  "fixes": ["reattached click handler to submit button", "fixed TypeError in event handler"]
}

Multi-cycle (e.g. browser + backend + node all active in the same turn): same single verdict. Cycles are derived from the file_changes you made; pass criteria for each is derived from your tool_calls.

The agent must submit a verdict after every verification attempt — both pass and fail. File edits are blocked until a verdict is submitted after using devtools tools.
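
The shape rules for a verdict (status, non-empty checks, issues on fail, fixes on pass after fail) can be summarized as a small validator. This is a sketch under assumptions, not the hook's real validation code: validateVerdict is a hypothetical name, previousStatus stands in for whatever the session records as the last verdict status, and requiring issues and fixes to be non-empty arrays is an assumption.

```typescript
interface Verdict {
  session_id: string;
  status: "pass" | "fail";
  checks: string[];
  issues?: string[];
  fixes?: string[];
}

function isNonEmptyList(value: unknown): boolean {
  return Array.isArray(value) && value.length > 0;
}

// Returns a list of problems; an empty list means the shape is acceptable.
function validateVerdict(input: unknown, previousStatus?: "pass" | "fail"): string[] {
  if (typeof input !== "object" || input === null) {
    return ["verdict must be a JSON object"];
  }
  const verdict = input as Partial<Verdict>;
  const errors: string[] = [];
  if (verdict.status !== "pass" && verdict.status !== "fail") {
    errors.push('status must be "pass" or "fail"');
  }
  if (!isNonEmptyList(verdict.checks)) {
    errors.push("checks must be a non-empty array");
  }
  if (verdict.status === "fail" && !isNonEmptyList(verdict.issues)) {
    errors.push("issues is required on fail");
  }
  if (verdict.status === "pass" && previousStatus === "fail" && !isNonEmptyList(verdict.fixes)) {
    errors.push("fixes is required on pass after fail");
  }
  return errors;
}
```

Applied to the three JSON examples above, the first passes with no previous status, the second passes as a fail verdict because it carries issues, and the third passes after a fail because it carries fixes.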

Session Isolation

Each AI session gets its own directory under .ironbee/sessions/<session-id>/:

.ironbee/sessions/<session-id>/
  actions.jsonl    # Event log (file edits, tool calls, verification markers)
  verdict.json     # Current verdict (cleared on code edit)
  state.json       # Session state (retries, activeVerificationId, activeTraceId,
                   #                lastVerdictStatus, activeFixId, activeActivityId,
                   #                phase, active, recordingRequired, recordingActive,
                   #                userEmail, usageType, usagePlan)
  session.log      # Debug log
  queue/           # File-backed job queue (jobs.jsonl, dead-letter.jsonl, worker.log)
  analytics/       # Per-session analytics state + analytics.log

This means parallel sessions (e.g., multiple Claude Code instances) don't interfere with each other.

Analytics

ironbee analyze provides metrics about verification sessions — how time is spent, how effective verifications are, and how confident we can be in the agent's code.

Usage

ironbee analyze <session-id>                    # single session analysis
ironbee analyze                                 # all sessions (project-level)
ironbee analyze --json                          # JSON output
ironbee analyze --detailed                      # include verdict details (checks, issues, fixes)
ironbee analyze --json --detailed               # JSON with verdict text for LLM semantic analysis
ironbee analyze <session-id> --json --detailed  # single session JSON with verdict details

The --detailed flag includes raw verdict text (checks, issues, fixes) in the output. This is designed for LLM-powered semantic analysis — use /ironbee-analyze in Claude Code or Cursor to have the agent interpret these details automatically.

Session Analysis

Phase Distribution

Each session is divided into three phases:

| Phase | What it measures |
| --- | --- |
| Coding | Time from session start to first verification, and between fix end and next verification start |
| Verification | Time between verification_start and verification_end (browser testing) |
| Fixing | Time between fix_start and fix_end (fixing failed verifications) |
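The phase split can be pictured as a walk over timestamped markers. The sketch below is illustrative only: the marker names verification_start, verification_end, fix_start and fix_end come from the table above, while `session_start`, `session_end`, the `Marker` shape and the millisecond `ts` field are assumptions, as is counting any gap after verification_end as coding time.

```typescript
type Phase = "coding" | "verification" | "fixing";

type MarkerType =
  | "session_start" // hypothetical
  | "verification_start"
  | "verification_end"
  | "fix_start"
  | "fix_end"
  | "session_end"; // hypothetical

interface Marker {
  type: MarkerType;
  ts: number; // milliseconds; assumed field name
}

// Attribute each interval between consecutive markers to the phase opened by the earlier marker.
function phaseDurations(markers: Marker[]): Record<Phase, number> {
  const totals: Record<Phase, number> = { coding: 0, verification: 0, fixing: 0 };
  for (let i = 1; i < markers.length; i++) {
    const prev = markers[i - 1];
    const cur = markers[i];
    let phase: Phase = "coding";
    if (prev.type === "verification_start") phase = "verification";
    else if (prev.type === "fix_start") phase = "fixing";
    totals[phase] += cur.ts - prev.ts;
  }
  return totals;
}

const exampleMarkers: Marker[] = [
  { type: "session_start", ts: 0 },
  { type: "verification_start", ts: 60000 },
  { type: "verification_end", ts: 90000 },
  { type: "fix_start", ts: 90000 },
  { type: "fix_end", ts: 120000 },
  { type: "verification_start", ts: 130000 },
  { type: "verification_end", ts: 150000 },
  { type: "session_end", ts: 150000 },
];

const examplePhases = phaseDurations(exampleMarkers);
console.log(examplePhases); // { coding: 70000, verification: 50000, fixing: 30000 }
```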

Cycles

| Metric | Meaning |
| --- | --- |
| Verifications | Number of verification cycles in the session |
| Fixes | Number of fix cycles (each fail verdict starts a fix) |
| Avg verify | Average duration of a verification cycle |
| Avg fix | Average duration of a fix cycle |
| First verify | Time from session start to first verification |

Verification Quality

| Metric | Meaning |
| --- | --- |
| First-pass rate | Percentage of verification chains where the first verdict was pass |
| Verdicts | Total verdict count (pass + fail) |
| Avg retries | Average number of fail verdicts before pass per chain |
| Avg checks | Average number of checks performed per verdict |
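As a rough illustration of these four metrics, the sketch below walks an ordered list of verdicts. It assumes a "chain" is a run of verdicts that ends at the next pass, and it counts the fails of an unfinished chain toward the retry average; both are guesses about the definition, and `VerdictRecord` and `verificationQuality` are hypothetical names.

```typescript
interface VerdictRecord {
  status: "pass" | "fail";
  checks: string[];
}

function verificationQuality(verdicts: VerdictRecord[]) {
  let chains = 0;
  let firstPassChains = 0;
  let fails = 0;
  let checks = 0;
  let chainOpen = false;

  for (const v of verdicts) {
    if (!chainOpen) {
      // This verdict opens a new chain.
      chains += 1;
      chainOpen = true;
      if (v.status === "pass") firstPassChains += 1;
    }
    if (v.status === "fail") fails += 1;
    else chainOpen = false; // a pass closes the chain
    checks += v.checks.length;
  }

  return {
    verdicts: verdicts.length,
    firstPassRate: chains === 0 ? 0 : (firstPassChains / chains) * 100,
    avgRetries: chains === 0 ? 0 : fails / chains,
    avgChecks: verdicts.length === 0 ? 0 : checks / verdicts.length,
  };
}

const twoChecks = ["page loads", "no console errors"];
const exampleQuality = verificationQuality([
  { status: "fail", checks: twoChecks },
  { status: "pass", checks: twoChecks }, // chain 1: one retry
  { status: "pass", checks: twoChecks }, // chain 2: first-pass
  { status: "fail", checks: twoChecks },
  { status: "fail", checks: twoChecks },
  { status: "pass", checks: twoChecks }, // chain 3: two retries
  { status: "pass", checks: twoChecks }, // chain 4: first-pass
]);
console.log(exampleQuality);
// { verdicts: 7, firstPassRate: 50, avgRetries: 0.75, avgChecks: 2 }
```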

Per-cycle structural metrics (avg console errors, avg network failures, avg pages tested) are temporarily absent — they depended on agent-claimed evidence that has been removed from the verdict shape. They will return when verify-gate derives structural evidence from tool_call records (TODO).

Code Changes

| Metric | Meaning |
| --- | --- |
| Total edits | Total file edit operations in the session |
| Unique files | Number of distinct files edited |
| Avg per verify | Average file edits before each verification |
| Avg per fix | Average file edits during each fix cycle |
| Hot Files | Top 5 most frequently edited files |
| Problematic Files | Top 5 files with most edits during fix cycles |
| Edit Churn | Files edited in 2+ separate fix cycles (root cause may not be resolved) |
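The file-level metrics above reduce to counting edits per file, with and without a fix-cycle filter. A minimal sketch follows, assuming each edit is recorded with a file path and, when it happened during a fix, the index of that fix cycle. `EditRecord`, `fixCycle` and the alphabetical tie-break are hypothetical choices, not the CLI's actual data model.

```typescript
interface EditRecord {
  file: string;
  fixCycle?: number; // index of the fix cycle the edit belongs to; undefined outside fixes
}

function bump(counts: Record<string, number>, key: string): void {
  counts[key] = (counts[key] ?? 0) + 1;
}

// Files sorted by descending count, ties broken alphabetically (an assumption).
function topFiles(counts: Record<string, number>, limit: number): string[] {
  return Object.keys(counts)
    .sort((a, b) => counts[b] - counts[a] || a.localeCompare(b))
    .slice(0, limit);
}

function codeChangeMetrics(edits: EditRecord[]) {
  const allEdits: Record<string, number> = {};
  const fixEdits: Record<string, number> = {};
  const fixCyclesByFile: Record<string, number[]> = {};

  for (const e of edits) {
    bump(allEdits, e.file);
    if (e.fixCycle !== undefined) {
      bump(fixEdits, e.file);
      const cycles = fixCyclesByFile[e.file] ?? [];
      if (cycles.indexOf(e.fixCycle) === -1) cycles.push(e.fixCycle);
      fixCyclesByFile[e.file] = cycles;
    }
  }

  return {
    totalEdits: edits.length,
    uniqueFiles: Object.keys(allEdits).length,
    hotFiles: topFiles(allEdits, 5),
    problematicFiles: topFiles(fixEdits, 5),
    // Edit churn: files touched in two or more separate fix cycles.
    editChurn: Object.keys(fixCyclesByFile)
      .filter((file) => fixCyclesByFile[file].length >= 2)
      .sort(),
  };
}

const exampleChanges = codeChangeMetrics([
  { file: "src/form.tsx" },
  { file: "src/form.tsx", fixCycle: 0 },
  { file: "src/list.tsx", fixCycle: 0 },
  { file: "src/form.tsx", fixCycle: 1 },
  { file: "src/api.ts" },
  { file: "src/list.tsx", fixCycle: 0 },
]);
console.log(exampleChanges);
```

In this example src/form.tsx is the only churn file: it is edited in fix cycles 0 and 1, while src/list.tsx is edited twice within the same cycle.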

Fix Effectiveness

| Metric | Meaning |
| --- | --- |
| Success rate | Percentage of fixes followed by a pass verdict |
| Re-fail rate | Percentage of fixes followed by another fail verdict |
| Fix/verify | Ratio of fix cycles to verification cycles (0 = no fixes needed) |

Scoring

Three scores summarize the session:

| Score | Formula | What it measures |
| --- | --- | --- |
| Efficiency | coding_time / (coding_time + fix_time) × 100 | How much productive time vs fix overhead. High = minimal wasted time on fixes |
| Quality | (pass_pct + checks_pct) / 2 | How thorough the verification was. Components: pass rate, check depth (5+ checks = 100%). Page-coverage and error-cleanliness components were temporarily removed (they depended on agent-claimed evidence) and will return when verify-gate derives structural evidence from tool_call records. |
| Confidence | pass_count / total_verdicts × 100 | How likely the agent's code works. Based on verdict pass rate |
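The three formulas can be worked through in a few lines. This is a sketch under stated assumptions rather than the CLI's implementation: `SessionTotals` and `sessionScores` are hypothetical names, checks_pct is assumed to scale linearly up to the 5-check cap using the average checks per verdict, and a zero denominator is assumed to yield 0.

```typescript
interface SessionTotals {
  codingMs: number;
  fixMs: number;
  passCount: number;
  totalVerdicts: number;
  avgChecks: number; // average checks per verdict
}

// Percentage with a guard for empty sessions (returning 0 is an assumption).
function pct(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : (numerator / denominator) * 100;
}

function sessionScores(t: SessionTotals) {
  const passPct = pct(t.passCount, t.totalVerdicts);
  // "5+ checks = 100%": assumed to grow linearly below the cap.
  const checksPct = Math.min(t.avgChecks / 5, 1) * 100;
  return {
    efficiency: pct(t.codingMs, t.codingMs + t.fixMs),
    quality: (passPct + checksPct) / 2,
    confidence: passPct,
  };
}

const exampleScores = sessionScores({
  codingMs: 90000,
  fixMs: 30000,
  passCount: 3,
  totalVerdicts: 4,
  avgChecks: 2.5,
});
console.log(exampleScores); // { efficiency: 75, quality: 62.5, confidence: 75 }
```

Here 90 s of coding against 30 s of fixing gives an efficiency of 75, three passes out of four verdicts give a confidence of 75, and quality averages that 75 with the 50 earned by 2.5 checks per verdict.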

Project Analysis

When run without a session ID, ironbee analyze aggregates metrics across all sessions:

| Metric | Meaning |
| --- | --- |
| Session History | Each session's summary: duration, cycles, outcome, score |
| Avg duration | Average session duration across all sessions |
| Avg verifies | Average verification cycles per session |
| Avg fixes | Average fix cycles per session |
| First-pass rate | Percentage of sessions where the first verdict was pass |
| Fix success rate | Percentage of all fixes (across sessions) that succeeded |
| Abandon rate | Percentage of sessions with interrupted verification/fix cycles |
| Avg efficiency | Average efficiency score across all sessions |
| Avg confidence | Average confidence score across all sessions |
| Problematic Files | Top 5 files with most fix edits across all sessions |

Telemetry

IronBee collects anonymous usage data to help improve the product. No source code, file contents, or personally identifiable information is ever sent.

Events collected: install/uninstall, session start, verdict submissions (pass/fail status only), and verification gate decisions.

To opt out, set the environment variable:

export IRONBEE_TELEMETRY=false

Or set telemetryEnabled: false in ~/.ironbee/telemetry.json.

Development

Requires Node.js ≥ 22 (Node 20 hit EOL on 2026-04-30).

npm install
npm run build       # tsc + scripts/copy-assets.js (mirrors .md/.mdc + assets/ to dist/)
npm run lint        # ESLint
npm run test        # Jest (unit + integration + client tests)
npm run dev         # Run via ts-node

CI runs the full test suite across linux × x64/arm64, darwin (Apple Silicon), and windows × x64/arm64 with Node 22 and 24. The build script is pure Node (no bash) so npm run build produces identical output on every OS.

License

Elastic License 2.0 (ELv2) — free to use, copy, modify, distribute, and embed in your own products. The only restriction is that you may not offer IronBee itself (or a substantially similar derivative) as a hosted or managed service to third parties.
