IronBee CLI

The CLI for IronBee — Verification and Intelligence Layer for Agentic Development

IronBee ensures that AI agents verify their code changes before completing a task. When an agent edits code, it cannot finish until it exercises the affected paths through real tools — in the browser for frontend changes, against the wire protocol (HTTP / gRPC / GraphQL / WebSocket) for any-runtime backend changes, or via the Node.js V8 inspector for Node-specific backend changes — and submits a passing verdict.

No more "it should work" — every change is tested.

IronBee also tracks every verification cycle — coding time, fix time, pass/fail rates, problematic files — and provides session and project-level analytics for LLM-powered semantic insights.

Powered by IronBee DevTools (@ironbee-ai/devtools), which runs in three modes from the same package:

Browser mode (bdt_* tools, default-on): the agent navigates pages, clicks buttons, fills forms, takes screenshots, checks console errors.
Backend mode (bedt_* tools, opt-in, runtime-agnostic): the agent drives real HTTP / gRPC / GraphQL / WebSocket calls against your backend, inspects logs, and queries databases — works for Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, and Scala backends alike.
Node mode (ndt_* tools, opt-in): the agent connects to a running Node process, sets V8 probes (tracepoint / logpoint / exceptionpoint) at the changed code paths, exercises them, and reads back snapshots or runtime logs.

A single Stop hook can drive multiple cycles in parallel — touching frontend, a backend protocol, and a Node runtime in the same change requires evidence for each before the task can complete.

Demo

(Demo video: IronBee CLI, 30 seconds)

Supported Clients

Client	Status
Claude Code	Supported
Cursor	Supported
Codex	Planned
OpenCode	Planned

Quick Start

Install IronBee globally

npm install -g @ironbee-ai/cli

Set up a project

cd your-project
ironbee install

This auto-detects your AI client and writes:

Hook configuration (so the client calls IronBee automatically)
Verification skill/rules (so the agent knows the workflow — covers every enabled cycle)
MCP server entries from the same @ironbee-ai/devtools package (IronBee DevTools), per-cycle gated — only currently enabled cycles get an entry:
- browser-devtools (PLATFORM=browser, bdt_ prefix) — registered on first install (browser is the default-on cycle); strip with ironbee browser disable
- backend-devtools (PLATFORM=backend, bedt_ prefix) — only after ironbee backend enable
- node-devtools (PLATFORM=node, ndt_ prefix) — only after ironbee node enable
Permissions matching the registered entries (mcp__browser-devtools__*, plus mcp__backend-devtools__* and/or mcp__node-devtools__* once their cycles are enabled)

Optional: backfill historical sessions

Already have weeks of Claude Code sessions on disk? ironbee import walks them and ships every session / activity / tool_call / file_change / analytics event to the IronBee Collector — so your dashboard fills with historical context the moment you finish installing. Already-tracked sessions (live or previously imported) are skipped automatically; pass --force to re-import.

Typical three-step flow:

# 1. Preview — zero POSTs, shows exact cost and event counts
ironbee import --since 30d --dry-run

# 2. Confirm and ship (interactive y/N prompt by default)
ironbee import --since 30d

# 3. Optional: cast a wider net later
ironbee import --all-projects --since 6m --concurrency 2

--dry-run always shows the exact cost_usd that will surface in your dashboard before you confirm — $342.18 is much less surprising when you know it's coming.

Common scenarios

Scenario	Command
Onboarding — current project, last 30 days	`ironbee import --since 30d`
Current project, full history	`ironbee import`
One specific project from anywhere	`ironbee import --projects /path/to/repo`
Multiple projects	`ironbee import --projects /repos/auth,/repos/payments`
Every project on this machine	`ironbee import --all-projects --since 6m`
Explicit date range (e.g. Q1 retrospective)	`ironbee import --all-projects --from 2025-01-01 --to 2025-03-31`
Single transcript file (debug / cherry-pick)	`ironbee import --transcript ~/.claude/projects/-Users-me-foo/abc.jsonl`
CI / scripted onboarding (no prompt)	`ironbee import --since 60d --yes`
Tune backend load	`ironbee import --since 6m --concurrency 2` (or `16` for fast pipes)
Force re-import a single session	`ironbee import --transcript path.jsonl --force --yes`

Flag groups

Scope (mutually exclusive — pick at most one; default is the current directory):

--transcript <path> — single .jsonl file
--projects <p1,p2,...> — comma-separated absolute project paths
--all-projects — every directory under ~/.claude/projects/

Time range (mutually exclusive; default is no filter):

--since <duration> — 30d, 2w, 6m, 12h (relative to now)
--from <iso-date> [--to <iso-date>] — explicit window; --to defaults to now

Behavior:

--dry-run — print summary, make zero POSTs, exit 0
--yes — skip the confirm prompt
--force — bypass the "already tracked" skip rule
--concurrency <N> — parallel sessions (default 4, clamped to [1, 32]); also configurable via import.concurrency in ~/.ironbee/config.json or <project>/.ironbee/config.json
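
To make the time-range and concurrency flags concrete, here is a minimal Python sketch of how a `--since` value could be turned into a cutoff timestamp and how `--concurrency` could be clamped. It is not the CLI's implementation. The function names are hypothetical, and treating `m` as a 30-day month is an assumption: the flag list only shows the accepted suffixes.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit lengths in hours. Treating "m" as a 30-day month is a guess,
# not documented behavior.
_UNIT_HOURS = {"h": 1, "d": 24, "w": 24 * 7, "m": 24 * 30}


def parse_since(duration: str, now: datetime) -> datetime:
    """Turn a relative duration such as '30d' into a cutoff timestamp."""
    match = re.fullmatch(r"(\d+)([hdwm])", duration.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return now - timedelta(hours=amount * _UNIT_HOURS[unit])


def clamp_concurrency(requested: int, low: int = 1, high: int = 32) -> int:
    """Clamp a requested worker count into the documented [1, 32] range."""
    return max(low, min(high, requested))


now = datetime(2025, 4, 1, tzinfo=timezone.utc)
print(parse_since("30d", now).date())  # 2025-03-02
print(clamp_concurrency(64))           # 32
```

Under these assumptions, sessions older than the returned cutoff would be filtered out before any event is shipped.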

Optional: opt out of the browser cycle

ironbee browser disable

The browser cycle is the default-on cycle — every code-file edit (40+ extensions: .ts, .tsx, .css, .html, .py, .go, .java, …) requires browser-driven verification (navigate / screenshot / aria / console). Run browser disable for projects where you don't want browser-cycle enforcement (e.g. backend-only services where only backend enable / node enable apply). It writes browser.verifyPatterns: [] to override the legacy 40+ extension default; customizations of alwaysRequired / evidencePaths / additionalVerifyPatterns are preserved.

To re-enable: ironbee browser enable — strips the verifyPatterns: [] override so the code defaults (legacy 40+ extension list) flow back in at runtime. config.json stays minimal; the default list is NOT materialized into the file (it lives in code and tracks the CLI version automatically).

Optional: enable runtime-agnostic backend protocol verification

ironbee backend enable

Activates the backend protocol cycle — drives real HTTP / gRPC / GraphQL / WebSocket calls against your running backend service via the backend-devtools MCP (bedt_* tools) and verifies the responses. Works for any backend runtime: Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, Scala. The command writes a minimal { "backend": {} } block to config — code defaults (multi-language paths covering server/**, api/**, routes/**, controllers/**, handlers/**, services/**) flow in at runtime.

To revert: ironbee backend disable. With no customizations and no lower-layer override, the backend block is removed entirely (clean config); otherwise the command writes verifyPatterns: [] as a hard kill.

Optional: enable Node.js runtime debug verification

ironbee node enable

Run this once in each project whose backend is Node.js and where you want IronBee to gate at the runtime level (V8 inspector probes via node-devtools). It writes a minimal { "node": {} } block to config; code defaults (e.g. server/**, pages/api/**, **/server.{ts,js,mjs,cjs}) flow in at runtime, and nothing is materialized into the file. From then on, edits to matching paths require Node-cycle verification (connect + probes/logs) alongside any browser-cycle verification. To customize, set node.verifyPatterns (replaces defaults) or node.additionalVerifyPatterns (appends).

The node cycle is independent of the backend cycle — backend drives the wire protocol from outside, while node attaches to a Node.js process and sets non-blocking debug probes. Both can be enabled simultaneously; both must pass.

To revert: ironbee node disable. With no customizations the entire node block is dropped (clean config). With customizations or a lower-layer override, writes verifyPatterns: [] (hard kill, preserves alwaysRequired / evidencePaths / additionalVerifyPatterns so re-enabling later restores your tuned setup).

Optional: monitoring-only mode (no enforcement)

ironbee verification disable

Turns off enforcement but keeps the telemetry path intact. Session lifecycle and tool-call events still flow to the IronBee Collector, but the agent never sees a verify-gate, skill, rule, or /ironbee-verify command — useful when you want observability without slowing the agent down. To re-enable: ironbee verification enable.

The toggle re-renders all client artifacts (hooks, skill, rule, MCP servers, permissions) atomically. The change takes effect on the next agent session — restart your editor / agent after toggling.

Cursor: additional setup

Cursor requires manual activation of MCP servers after install:

Restart Cursor to load the new hooks and MCP config
Go to Settings → Tools & MCP and verify each registered IronBee server is enabled — browser-devtools is always present on a default install; backend-devtools appears after ironbee backend enable; node-devtools appears after ironbee node enable
If a server shows as enabled but tools are unavailable, toggle it off and on

Note: This is a known Cursor limitation — MCP servers added via mcp.json may need manual activation.

That's it

The next time your AI agent edits code, IronBee will require verification before the task can complete — browser cycle for frontend changes, backend cycle for runtime-agnostic protocol calls (if enabled), Node cycle for Node.js runtime debug (if enabled), or any combination in parallel.

Commands

ironbee install [project-dir] [--client <name>] [--all]   Set up hooks and config; --all → batch across every registered project
ironbee uninstall [project-dir] [--client <name>] [--all] [-y]   Remove hooks and config; --all → batch wipe across every registered project (destructive, prompts unless --yes)
ironbee update                                    Update IronBee CLI to the latest version (npm self-update)
ironbee status [project-dir]                      Show verdict status for active sessions
ironbee verify [session-id]                       Dry-run verdict validation
ironbee analyze [session-id]                      Analyze session metrics (or all sessions)
ironbee import [options]                          Backfill historical Claude sessions to the IronBee Collector (--since / --from / --to, --transcript / --projects / --all-projects, --dry-run, --yes, --force, --concurrency)
ironbee browser <enable|disable>   [-g|--local] [--client <name>]   Manage the browser cycle (default-on; bdt_* tools via browser-devtools)
ironbee backend <enable|disable>   [-g|--local] [--client <name>]   Manage the runtime-agnostic backend protocol cycle (HTTP/gRPC/GraphQL/WS via backend-devtools)
ironbee node <enable|disable>      [-g|--local] [--client <name>]   Manage the Node.js runtime debug cycle (V8 inspector probes via node-devtools)
ironbee verification <enable|disable> [-g|--local] [--client <name>]   Master verification toggle (enable = enforce; disable = monitoring-only, no enforcement but sessions/tools still ship to collector)
ironbee config get   <key>         [-g|--project|--local]   Read a config value (default: merged effective value; flags narrow to one of the three layers)
ironbee config set   <key> <value> [-g|--local] [--client <name>] [--no-rerender] [--json] [--apply-all|--no-apply-all]   Write a config value; auto re-renders client artifacts on artifact-affecting keys; -g writes global, --local writes project-local (gitignored)
ironbee config unset <key>         [-g|--local] [--client <name>] [--no-rerender] [--apply-all|--no-apply-all]   Remove a config value (idempotent); same target / rerender rules as set
ironbee config list                [-g|--project|--local]   Print the entire config (merged / global / project / local)
ironbee config path                [-g|--local]   Print the on-disk path of the targeted config file (project default; -g for global, --local for project-local)
ironbee register   [-p <dir>] [--client <name>]   Add this project to the user-home inventory (no artifact writes)
ironbee unregister [-p <dir>] [--client <name>]   Remove this project from the user-home inventory (no artifact writes)
ironbee queue status [--session <id>]             Queue status per session (counts, recent dead-letter errors)
ironbee queue drain  [--session <id>]             Synchronously drain pending snapshots
ironbee queue dead-letter list|stats|retry|clear  Inspect / retry / clear dead-letter entries

Projects inventory

ironbee install records each project it touches in ~/.ironbee/projects.json; ironbee uninstall removes it. The inventory powers two cross-project workflows:

ironbee install --all — explicit batch op that re-runs install on every registered project. Use after a global config change to propagate it everywhere; uses each project's currently detected clients (or pass --client <name> to override).
ironbee uninstall --all — destructive batch op that wipes ironbee from every registered project. Prompts with default-No before acting; pass --yes / -y to skip the prompt. Refuses without --yes in non-interactive contexts.
Prompt on global config writes — ironbee config set <key> <val> -g (and unset) on an artifact-affecting key (collector, verification, browser, backend, node, browserDevTools, backendDevTools, nodeDevTools) lists up to 10 other registered project paths still on the prior state and asks Apply this change to these N projects now? [Y/n] (default Yes). Pass --apply-all / --no-apply-all to skip the prompt; non-TTY contexts skip it and print a hint pointing at install --all.

For pure inventory bookkeeping (no artifact writes):

ironbee register — adds the current project to the inventory. Useful for projects set up before this feature existed.
ironbee unregister — removes the current project from the inventory. Works on already-deleted project dirs.

Agent Commands (slash commands)

IronBee installs slash commands that the agent can use inside Claude Code or Cursor:

Command	Description
`/ironbee-verify`	Verify changes — focused on affected areas (default)
`/ironbee-verify full`	Full verification — complete visual + functional + accessibility checklists
`/ironbee-verify visual`	Visual-only — contrast, layout, spacing, fonts, images, theming
`/ironbee-verify functional`	Functional-only — clicks, forms, navigation, data flow, error handling
`/ironbee-analyze`	Run session analytics and provide LLM-powered semantic insights

/ironbee-verify guides the agent through a systematic verification process. The default mode focuses on what changed, while full runs every checklist item. Use visual or functional to narrow the scope when you know what type of testing is needed.

Configuration

IronBee loads config from three layers and deep-merges them in order (each later layer overrides the earlier ones), then layers env-var overrides on top:

Global — ~/.ironbee/config.json
Project — <project>/.ironbee/config.json (committed; team-shared)
Project-local — <project>/.ironbee/config.local.json (gitignored; per-machine / per-developer override)
Env-var overrides — selected IRONBEE_* env vars (e.g. IRONBEE_API_KEY → collector.apiKey); env always wins over every file layer. See Env-var overrides below.

The local layer is optional — ironbee install adds .ironbee/config.local.json to .gitignore automatically, but the file is only created when you actually write to it (e.g. ironbee config set ... --local).
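
The layering order above can be illustrated with a short Python sketch. It is only a model of the described behavior, not the CLI's code. The function names are hypothetical, and replacing arrays wholesale (rather than concatenating them) is an assumption, since the text only says the layers are deep-merged.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: scalars and arrays are replaced wholesale.
            merged[key] = copy.deepcopy(value)
    return merged


def effective_config(global_cfg: dict, project_cfg: dict,
                     local_cfg: dict, env: dict) -> dict:
    """Merge global -> project -> local, then apply the env-var override."""
    merged = deep_merge(deep_merge(global_cfg, project_cfg), local_cfg)
    api_key = env.get("IRONBEE_API_KEY", "")
    if api_key:  # unset or empty string falls back to the file value
        merged = deep_merge(merged, {"collector": {"apiKey": api_key}})
    return merged


cfg = effective_config(
    {"collector": {"url": "https://collector.example.com", "apiKey": "file-key"}},
    {"maxRetries": 5},
    {"collector": {"url": "http://localhost:4000"}},
    {"IRONBEE_API_KEY": "env-key"},
)
print(cfg["collector"])  # {'url': 'http://localhost:4000', 'apiKey': 'env-key'}
```

In this model the local layer overrides only `collector.url`, and the env var overrides only `collector.apiKey`; every other key survives from the lower layers.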

{
  "ignoredVerifyPatterns": ["*.test.ts", "*.spec.ts"],
  "maxRetries": 5,

  "browser": {
    "verifyPatterns": ["*.ts", "*.tsx", "*.css"],
    "additionalVerifyPatterns": ["*.mdx"]
  },

  "backend": {
    "verifyPatterns": ["routes/**/*.{go,py,java,ts}", "controllers/**/*.{go,py,java}"]
  },

  "node": {
    "verifyPatterns": ["server/**/*.ts", "pages/api/**/*.ts"]
  },

  "verification": {
    "enable": false
  },

  "fileChange": {
    "captureChangeset": true
  }
}

Key	Description	Default
`browser.verifyPatterns`	Glob patterns for files requiring browser verification (replaces defaults). Four-state semantic: block-absent → code defaults (40+ ext, default-on); block-present + verifyPatterns unset → code defaults (post-`browser enable` shape); `[]` → hard kill (also disables `additionalVerifyPatterns`); custom `[...]` → user-defined.	40+ code extensions when block absent OR `verifyPatterns` unset
`browser.additionalVerifyPatterns`	Extra browser patterns appended to defaults	`[]`
`backend.verifyPatterns`	Glob patterns activating the runtime-agnostic backend protocol cycle (`backend-devtools` MCP, `bedt_` tools — HTTP / gRPC / GraphQL / WebSocket). Same four-state semantic, default-off: block absent → cycle disabled; block present + `verifyPatterns` unset → 13 default patterns from code (multi-language: `routes/`, `controllers/`, `handlers/`, `services/*` across `.ts/.js/.py/.go/.java/.rb/.cs/.rs/.kt/.scala/.ex/.exs/.php/.clj`); `[]` → hard kill; custom `[...]` → user-defined. Opt in via `ironbee backend enable`.	block absent → disabled; block present + unset → 13 code defaults
`backend.additionalVerifyPatterns`	Extra patterns appended to `backend.verifyPatterns` (or to code defaults when verifyPatterns is unset). Ignored when `verifyPatterns: []`.	`[]`
`backend.alwaysRequired`	Backend-cycle required tools (all-of). Empty default — backend uses any-of evidence paths.	`[]`
`backend.evidencePaths`	Alternative tool paths — at least one must be fully satisfied. Defaults: `protocol-call` (any `bedt_request_*`) OR `log-evidence` (`bedt_log_register-source` AND any read/follow) OR `db-evidence` (`bedt_db_connect` AND any inspect tool).	protocol-call OR log-evidence OR db-evidence
`node.verifyPatterns`	Glob patterns activating the Node.js runtime debug cycle (`node-devtools` MCP, `ndt_` tools — V8 inspector probes). Same four-state semantic as `browser.verifyPatterns`, but default-off: block absent → cycle disabled; block present + `verifyPatterns` unset → 9 default patterns from code (`server/`, `pages/api/`, `*/server.{ts,js,mjs,cjs}`, …); `[]` → hard kill; custom `[...]` → user-defined. Opt in via `ironbee node enable`.	block absent → disabled; block present + unset → 9 code defaults
`node.additionalVerifyPatterns`	Extra patterns appended to `node.verifyPatterns` (or to code defaults when verifyPatterns is unset). Ignored when `verifyPatterns: []`.	`[]`
`node.alwaysRequired`	Node-cycle required tools (all-of)	`["ndt_debug_connect"]`
`node.evidencePaths`	Alternative tool paths — at least one must be fully satisfied	probe path + log path
`ignoredVerifyPatterns`	Patterns to exclude from verification (checked first, applies to all cycles)	`[]`
`maxRetries`	Max retry attempts before allowing completion (single global counter regardless of how many cycles run)	`3`
`verification.enable`	Master switch for enforcement. Inverse semantics from `recording`/`jobQueue`/`collector` — verification is the core feature, opt-out via `enable: false`. When disabled, ironbee runs in monitoring-only mode (no enforcement hooks, skill, rule, or MCP servers; only session/activity/tool_call telemetry flows to the collector).	`true`
`fileChange.captureChangeset`	When `true`, every `file_change` event carries a hunks-only unified-diff `changeset` string (`@@` headers + `space`/`-`/`+` lines, no filename header — `file_path` already lives on the parent event). Off by default — the default `tool_input` whitelist deliberately strips file content from the wire; turning this on routes content through `file_change` instead. PreToolUse pre-reads the file when enabled so PostToolUse can produce a real before/after diff (Write/Edit on Claude; Write/StrReplace/Delete on Cursor). Skipped on binary content (NUL byte in first 4 KB).	`false`
`fileChange.maxChangesetBytes`	Hard cap on the `changeset` string size. Diffs over the cap are sliced on a UTF-8 byte boundary and end with a `\n... (truncated, N bytes omitted)\n` footer so the collector POST stays within typical reverse-proxy body limits.	`65536` (64 KB)
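
The two `fileChange` rows describe a binary check and a byte-boundary truncation. A minimal Python sketch of both follows. The function names are hypothetical, and it assumes the footer is appended after the cap is applied (the table does not say whether the footer counts toward `maxChangesetBytes`).

```python
def looks_binary(data: bytes, probe_bytes: int = 4096) -> bool:
    """Binary heuristic from the table: a NUL byte in the first 4 KB."""
    return b"\x00" in data[:probe_bytes]


def truncate_changeset(changeset: str, max_bytes: int = 65536) -> str:
    """Cap a diff string at max_bytes of UTF-8 without splitting a character."""
    raw = changeset.encode("utf-8")
    if len(raw) <= max_bytes:
        return changeset
    # errors="ignore" drops the partial multi-byte sequence left at the cut,
    # so the kept text always ends on a character boundary.
    kept = raw[:max_bytes].decode("utf-8", errors="ignore")
    omitted = len(raw) - len(kept.encode("utf-8"))
    # Assumption: the footer is added on top of the capped content.
    return f"{kept}\n... (truncated, {omitted} bytes omitted)\n"


print(repr(truncate_changeset("aé", max_bytes=2)))
# 'a\n... (truncated, 2 bytes omitted)\n'
```

Cutting on a character boundary matters because a diff that ends in half of a multi-byte sequence would no longer be valid UTF-8 when posted to the collector.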

Editing config from the CLI (`ironbee config`)

You can edit any of the three config layers via the CLI instead of hand-rolling JSON:

# Read the effective (merged) value across all three layers
ironbee config get collector.url

# Write to project config (default — committed, team-shared)
ironbee config set collector.url https://collector.example.com
ironbee config set maxRetries 5
ironbee config set verification.enable false
ironbee config set browser.verifyPatterns '["*.ts", "*.tsx", "*.css"]'

# Write to global config (~/.ironbee/config.json)
ironbee config set collector.apiKey sk-... --global

# Write to project-local config (<project>/.ironbee/config.local.json — gitignored, per-machine)
ironbee config set collector.url http://localhost:4000 --local

# Remove a value (idempotent — no-op if absent)
ironbee config unset collector.url            # project layer
ironbee config unset collector.url --local    # local layer

# Inspect (default reads merged effective; flags narrow to a single layer)
ironbee config list                # merged effective config across all three layers
ironbee config list --global       # global file only
ironbee config list --project      # project file only
ironbee config list --local        # project-local file only
ironbee config path                # print the project config file path
ironbee config path --local        # print the project-local config file path

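Target flags are mutually exclusive: pass at most one of -g/--global, --project, or --local. --project is accepted only by the read commands (get, list); writes already target the project layer by default, so set and unset take only -g or --local.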

Type coercion — set parses the value as JSON when it can (true/42/[…]/{…}) and falls back to a raw string when JSON parse fails. URLs and paths pass through unquoted; pass --json to force strict JSON parsing (e.g. when you want the literal string "42" instead of the number 42).
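
The coercion rule can be modelled in a few lines of Python. This is a sketch of the described behavior, not the CLI's parser. `coerce_value` and its `strict_json` parameter are hypothetical names standing in for `set` and `--json`.

```python
import json


def _reject_constant(name: str):
    # Python's json module accepts NaN/Infinity by default; strict JSON does not.
    raise ValueError(f"not valid JSON: {name}")


def coerce_value(raw: str, strict_json: bool = False):
    """Parse raw as JSON when possible; otherwise keep it as a plain string."""
    try:
        return json.loads(raw, parse_constant=_reject_constant)
    except ValueError:
        if strict_json:
            raise  # mirrors --json: no silent fallback to a string
        return raw


print(coerce_value("42"))                             # 42
print(coerce_value("https://collector.example.com"))  # https://collector.example.com
print(repr(coerce_value('"42"', strict_json=True)))   # '42'
```

The last call shows why a strict mode is useful: quoting the value inside the JSON yields the string "42", whereas the bare `42` is always read as a number.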

Smart artifact re-render — when a top-level key affects installed client artifacts (verification, collector, browser, backend, node, browserDevTools, backendDevTools, nodeDevTools), set and unset re-render the client files (hooks, MCP entries, skill, rule, permissions) automatically — same code path verification enable / backend enable / node enable use. Other keys (maxRetries, recording, jobQueue, analytics, import, ignoredVerifyPatterns) are pure config flips that the next agent session picks up — no rerender needed.

Pass --no-rerender to skip the rerender on artifact-affecting keys (handy for scripted bulk edits — follow up with ironbee install to resync). If a rerender fails midway, the config file is rolled back to its prior bytes so disk state never diverges from installed artifacts.

Restart your editor / agent session after changing artifact-affecting keys — the host caches hook config at session start, so the new state takes effect on the next run.

Env-var overrides

A small allowlist of IRONBEE_* env vars overrides specific config paths on top of the three file layers. Useful for secrets that shouldn't be committed (CI runners, ephemeral shells, multi-env desktop setups). Set to a non-empty string to override; unset or empty-string falls back to the file value. Env always wins over every file layer.

Env var	Config path	Notes
`IRONBEE_API_KEY`	`collector.apiKey`	Lets CI / per-shell setups supply the collector API key without committing it. Combined with a file-set `collector.url`, the merged effective config has both required fields.

# Use a one-shot key for this shell only
export IRONBEE_API_KEY=sk-...
ironbee config get collector.apiKey         # returns the env value (merged read)
ironbee config get collector.apiKey --project   # returns only what's in the project file (env bypassed)

Layer-specific reads (--global / --project / --local) bypass env overrides and show only what's on disk in that layer. The default merged read surfaces the env value when set, so it always reflects what the runtime will actually use.

ironbee config set / unset warn when the targeted path is shadowed by a live env override — the file write still succeeds, but the operator's value won't take effect until the env var is unset.

Default verify patterns

By default, the browser cycle is enabled and matches common code file extensions: .ts, .tsx, .js, .jsx, .css, .scss, .html, .py, .go, .rs, .java, .vue, .svelte, and many more (DEFAULT_BROWSER_VERIFY_PATTERNS). Backend file edits trigger browser verification by default since they often affect frontend behavior. Run ironbee browser disable for projects where the browser-cycle gate isn't appropriate (e.g. backend-only services); ironbee browser enable re-enables.

Patterns are NOT materialized into config.json: they live in the CLI source (DEFAULT_BROWSER_VERIFY_PATTERNS / DEFAULT_BACKEND_VERIFY_PATTERNS / DEFAULT_NODE_VERIFY_PATTERNS) and flow in at runtime when the cycle block exists without an explicit verifyPatterns key (or, for the default-on browser cycle, when the block is absent altogether). This keeps config.json minimal and lets defaults track CLI updates automatically (no frozen-at-install-time drift). To customize, set the explicit <cycle>.verifyPatterns (replaces defaults) or <cycle>.additionalVerifyPatterns (appends).
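
The four-state rule from the configuration table can be sketched in Python as follows. This is an illustration only. `resolve_patterns` is a hypothetical name, and the `CODE_DEFAULTS` lists are short placeholders, not the real default lists that ship in the CLI source.

```python
from typing import Optional

# Placeholder defaults for illustration; the real lists live in the CLI source.
CODE_DEFAULTS = {
    "browser": ["*.ts", "*.tsx", "*.css"],
    "backend": ["routes/**", "controllers/**"],
    "node": ["server/**", "pages/api/**"],
}
DEFAULT_ON = {"browser"}  # backend and node are opt-in


def resolve_patterns(cycle: str, config: dict) -> list:
    """Apply the four-state rule and return the effective glob list."""
    block: Optional[dict] = config.get(cycle)
    if block is None:
        # State 1: block absent. Defaults for a default-on cycle, else disabled.
        return list(CODE_DEFAULTS[cycle]) if cycle in DEFAULT_ON else []
    patterns = block.get("verifyPatterns")
    if patterns is None:
        base = list(CODE_DEFAULTS[cycle])  # State 2: block present, key unset
    elif not patterns:
        return []                          # State 3: [] is a hard kill
    else:
        base = list(patterns)              # State 4: custom list replaces defaults
    # additionalVerifyPatterns append in states 2 and 4 only.
    return base + list(block.get("additionalVerifyPatterns", []))


print(resolve_patterns("backend", {}))                 # []
print(resolve_patterns("backend", {"backend": {}}))    # ['routes/**', 'controllers/**']
```

An empty result means the cycle never activates, which is how both "never enabled" and "hard kill" look at runtime in this model.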

The backend cycle is opt-in via ironbee backend enable and is runtime-agnostic (drives wire protocols via backend-devtools). The node cycle is opt-in via ironbee node enable (only meaningful for Node.js backends — node-devtools is a V8 inspector wrapper).

Non-code files like README.md, package.json, or .gitignore do not trigger any cycle.

Devtools MCP server config

IronBee can register up to three MCP server entries from the same @ironbee-ai/devtools package (IronBee DevTools) — browser-devtools (bdt_ prefix, browser mode), backend-devtools (bedt_ prefix, runtime-agnostic backend mode), and node-devtools (ndt_ prefix, Node mode). Each is per-cycle gated (only enabled cycles get an entry) and can be customized independently via its own config block.

For the browser server, use browserDevTools:

{
  "browserDevTools": {
    "mcp": {
      "url": "http://localhost:4000/mcp"
    }
  }
}

For the backend server, use backendDevTools:

{
  "backendDevTools": {
    "env": { "BACKEND_DEFAULT_HOST": "http://localhost:8080" }
  }
}

For the node server, use nodeDevTools:

{
  "nodeDevTools": {
    "env": { "NODE_INSPECTOR_HOST": "127.0.0.1" }
  }
}

You can mix-and-match: full config replacement via mcp, or just env-var additions via env. The two blocks below combine — one uses mcp for full replacement on the browser server, the other adds env vars to the backend server:

{
  "browserDevTools": {
    "mcp": {
      "command": "node",
      "args": ["./my-server.js"],
      "env": { "MY_VAR": "value" }
    }
  },
  "backendDevTools": {
    "env": { "OTEL_ENABLE": "true" }
  }
}

Key	Description
`browserDevTools.mcp` / `backendDevTools.mcp` / `nodeDevTools.mcp`	Full MCP server config — used as-is when provided. Supports `command`+`args` (stdio) or `url` (HTTP)
`browserDevTools.env` / `backendDevTools.env` / `nodeDevTools.env`	Extra env vars merged into the default config. Only used when `mcp` is not provided

Note: IronBee always sets TOOL_NAME_PREFIX (bdt_ / bedt_ / ndt_), TOOL_INPUT_METADATA_ENABLE=true, and PLATFORM (browser / backend / node) — these cannot be overridden. When collector is configured, an OTEL exporter env block is also auto-injected on every server entry; operators can override individual OTEL_* keys via the env block above.
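
One way to picture the precedence in the note above is the Python sketch below, which covers only the `env` path (no `mcp` replacement). The function name and the exact merge order (auto-injected OTEL block first, then user `env`, then the forced keys) are assumptions inferred from the note, not taken from the CLI source.

```python
# Forced values per cycle, as listed in the note above.
FORCED_ENV = {
    "browser": {"TOOL_NAME_PREFIX": "bdt_", "PLATFORM": "browser"},
    "backend": {"TOOL_NAME_PREFIX": "bedt_", "PLATFORM": "backend"},
    "node": {"TOOL_NAME_PREFIX": "ndt_", "PLATFORM": "node"},
}


def build_server_env(cycle: str, user_env: dict, otel_env: dict = None) -> dict:
    """Compose the env block for one devtools server entry."""
    env = {}
    env.update(otel_env or {})     # auto-injected when a collector is configured
    env.update(user_env)           # operators may override individual OTEL_* keys
    env.update(FORCED_ENV[cycle])  # applied last, so they cannot be overridden
    env["TOOL_INPUT_METADATA_ENABLE"] = "true"
    return env


env = build_server_env(
    "backend",
    {"PLATFORM": "browser", "OTEL_ENABLE": "false"},
    otel_env={"OTEL_ENABLE": "true"},
)
print(env["PLATFORM"], env["OTEL_ENABLE"])  # backend false
```

Here the user's attempt to change `PLATFORM` is ignored while the `OTEL_ENABLE` override is kept, matching the two behaviors the note describes.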

Verification Flow (multi-cycle)

When the agent tries to complete a task, IronBee runs these checks:

Were code files edited? — If no matching files were changed, the agent completes normally.
Which cycles are active? — IronBee matches each edited file against browser.verifyPatterns and (if you opted in) backend.verifyPatterns and/or node.verifyPatterns. A single file may activate two or three cycles; they all run in parallel and pass/fail combine with AND.
Were the cycle's required tools used?
- Browser cycle: navigate, screenshot, accessibility snapshot, console check (all-of)
- Backend cycle: at least one evidence path must be fully exercised — protocol-call (any one of bedt_request_http / bedt_request_grpc / bedt_request_graphql / bedt_request_websocket-open / bedt_request_replay), OR log-evidence (bedt_log_register-source AND any one of bedt_log_read / bedt_log_read-multi / bedt_log_follow), OR db-evidence (bedt_db_connect AND any one of bedt_db_query / bedt_db_describe-table / bedt_db_list-tables / bedt_db_snapshot / bedt_db_diff / bedt_db_get-changes)
- Node cycle: connect; then either probe path ((put-tracepoint | put-logpoint | put-exceptionpoint) AND get-probe-snapshots) OR log path (get-logs)
Does a verdict exist? — The agent must submit a single verdict via ironbee hook submit-verdict.
Is the verdict valid? — Required: status ∈ {pass, fail} + checks (non-empty array). On fail, issues is required; on pass-after-fail, fixes is required.
Pass or fail? — Server-derived pass criteria from tool_call records is currently a no-op stub (TODO — see verify-gate.ts). For now status: "pass" is honored as-is. When evidence extractors land, per-cycle pass criteria (zero console errors, probe triggered, evidence path exercised) will be derived from the agent's tool_calls and override status: pass to fail when criteria don't hold.
Retry limit — After maxRetries failed attempts (default 3, single global counter), the agent is allowed to complete but must report unresolved issues.
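
The backend cycle's "at least one evidence path fully exercised" rule is an OR over paths with an AND inside each path. A minimal Python sketch follows, using the tool names listed above. The data layout and function names are hypothetical and do not reflect how verify-gate.ts stores them.

```python
# Each evidence path is a list of requirements. Each requirement is a set of
# tool names, any one of which satisfies it.
BACKEND_EVIDENCE_PATHS = {
    "protocol-call": [
        {"bedt_request_http", "bedt_request_grpc", "bedt_request_graphql",
         "bedt_request_websocket-open", "bedt_request_replay"},
    ],
    "log-evidence": [
        {"bedt_log_register-source"},
        {"bedt_log_read", "bedt_log_read-multi", "bedt_log_follow"},
    ],
    "db-evidence": [
        {"bedt_db_connect"},
        {"bedt_db_query", "bedt_db_describe-table", "bedt_db_list-tables",
         "bedt_db_snapshot", "bedt_db_diff", "bedt_db_get-changes"},
    ],
}


def satisfied_paths(used_tools: set, paths: dict) -> list:
    """Names of evidence paths whose every requirement was met."""
    return [name for name, reqs in paths.items()
            if all(used_tools & req for req in reqs)]


def cycle_has_evidence(used_tools: set, always_required: set, paths: dict) -> bool:
    """All-of the required tools AND at least one fully satisfied path."""
    return always_required <= used_tools and bool(satisfied_paths(used_tools, paths))


print(satisfied_paths({"bedt_log_read"}, BACKEND_EVIDENCE_PATHS))      # []
print(satisfied_paths({"bedt_request_http"}, BACKEND_EVIDENCE_PATHS))  # ['protocol-call']
```

Reading a log without first registering its source leaves the log-evidence path incomplete, so the first call returns no satisfied path.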

Verdict format

Verdicts are platform-agnostic: the same minimal shape regardless of which cycles (browser / backend / node / multi-cycle) ran. Structural evidence (pages tested, console error counts, endpoints called, log sources, DB connections, probe snapshots, …) is intentionally NOT part of the verdict. Once evidence extraction lands, the gate will derive it from the tool_call records of your bdt_* / bedt_* / ndt_* invocations, so the agent cannot misreport it.

Submit via echo '<json>' | ironbee hook submit-verdict:

{
  "session_id": "<your-session-id>",
  "status": "pass",
  "checks": ["form submits successfully", "new item appears in list"]
}

On failure, include an issues array describing what went wrong:

{
  "session_id": "<your-session-id>",
  "status": "fail",
  "checks": ["form renders", "submit button unresponsive"],
  "issues": ["button click handler not firing", "TypeError in console"]
}

On pass after a previous fail, include a fixes array describing what was fixed:

{
  "session_id": "<your-session-id>",
  "status": "pass",
  "checks": ["form submits successfully", "new item appears in list"],
  "fixes": ["reattached click handler to submit button", "fixed TypeError in event handler"]
}

Multi-cycle (e.g. browser + backend + node all active in the same turn): same single verdict. Cycles are derived from the file_changes you made; per-cycle pass criteria will be derived from your tool_calls once evidence extraction lands (for now, the submitted status is honored as-is).

The agent must submit a verdict after every verification attempt, whether it passes or fails. Once devtools tools have been used, further file edits are blocked until a verdict is submitted.
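
The shape rules for a verdict (status, checks, issues on fail, fixes on pass after fail) can be expressed as a small validator. The Python sketch below is illustrative only. `validate_verdict` and `last_status` are hypothetical names, and treating an empty `issues` or `fixes` array as missing is an assumption.

```python
def validate_verdict(verdict: dict, last_status: str = None) -> list:
    """Return a list of problems; an empty list means the shape is valid."""
    problems = []
    status = verdict.get("status")
    if status not in ("pass", "fail"):
        problems.append("status must be 'pass' or 'fail'")
    checks = verdict.get("checks")
    if not isinstance(checks, list) or not checks:
        problems.append("checks must be a non-empty array")
    # Assumption: an empty issues/fixes array counts as missing.
    if status == "fail" and not verdict.get("issues"):
        problems.append("issues is required on fail")
    if status == "pass" and last_status == "fail" and not verdict.get("fixes"):
        problems.append("fixes is required on pass after fail")
    return problems


retry = {"status": "pass", "checks": ["form submits successfully"]}
print(validate_verdict(retry))                      # []
print(validate_verdict(retry, last_status="fail"))  # ['fixes is required on pass after fail']
```

The same verdict is valid as a first attempt but incomplete as a follow-up to a failure, which is why the validator needs the previous status as input.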

Session Isolation

Each AI session gets its own directory under .ironbee/sessions/<session-id>/:

.ironbee/sessions/<session-id>/
  actions.jsonl    # Event log (file edits, tool calls, verification markers)
  verdict.json     # Current verdict (cleared on code edit)
  state.json       # Session state (retries, activeVerificationId, activeTraceId,
                   #                lastVerdictStatus, activeFixId, activeActivityId,
                   #                phase, active, recordingRequired, recordingActive,
                   #                userEmail, usageType, usagePlan)
  session.log      # Debug log
  queue/           # File-backed job queue (jobs.jsonl, dead-letter.jsonl, worker.log)
  analytics/       # Per-session analytics state + analytics.log

This means parallel sessions (e.g., multiple Claude Code instances) don't interfere with each other.

Analytics

ironbee analyze provides metrics about verification sessions — how time is spent, how effective verifications are, and how confident we can be in the agent's code.

Usage

ironbee analyze <session-id>                    # single session analysis
ironbee analyze                                 # all sessions (project-level)
ironbee analyze --json                          # JSON output
ironbee analyze --detailed                      # include verdict details (checks, issues, fixes)
ironbee analyze --json --detailed               # JSON with verdict text for LLM semantic analysis
ironbee analyze <session-id> --json --detailed  # single session JSON with verdict details

The --detailed flag includes raw verdict text (checks, issues, fixes) in the output. This is designed for LLM-powered semantic analysis — use /ironbee-analyze in Claude Code or Cursor to have the agent interpret these details automatically.

Session Analysis

Phase Distribution

Each session is divided into three phases:

Phase	What it measures
Coding	Time from session start to first verification, and between fix end and next verification start
Verification	Time between `verification_start` and `verification_end`, while the agent exercises the change through devtools tools
Fixing	Time between `fix_start` and `fix_end` — fixing failed verifications
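
The phase split in the table above can be sketched as a single pass over the session's marker events. This is an illustration under stated assumptions, not the CLI's implementation: the `MarkerEvent` shape (a `type` string plus a numeric `ts` timestamp) and the `phaseTotals` function are hypothetical, and only the four marker names come from the table.

```typescript
// Hypothetical event shape; the real actions.jsonl format may differ.
interface MarkerEvent {
  type: string; // e.g. "verification_start", "verification_end", "fix_start", "fix_end"
  ts: number;   // timestamp in any consistent unit (assumed milliseconds)
}

interface PhaseTotals {
  coding: number;
  verification: number;
  fixing: number;
}

function phaseTotals(sessionStart: number, events: MarkerEvent[]): PhaseTotals {
  const totals: PhaseTotals = { coding: 0, verification: 0, fixing: 0 };
  let codingSince: number | null = sessionStart; // coding runs from session start
  let verifySince: number | null = null;
  let fixSince: number | null = null;

  for (const e of events) {
    switch (e.type) {
      case "verification_start":
        // Coding ends when a verification begins.
        if (codingSince !== null) {
          totals.coding += e.ts - codingSince;
          codingSince = null;
        }
        verifySince = e.ts;
        break;
      case "verification_end":
        if (verifySince !== null) {
          totals.verification += e.ts - verifySince;
          verifySince = null;
        }
        break;
      case "fix_start":
        fixSince = e.ts;
        break;
      case "fix_end":
        if (fixSince !== null) {
          totals.fixing += e.ts - fixSince;
          fixSince = null;
        }
        codingSince = e.ts; // coding resumes between fix end and the next verification
        break;
    }
  }
  return totals;
}
```

Events of any other type are ignored, and an interval that was opened but never closed (for example an interrupted verification) contributes nothing to the totals in this sketch.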

Cycles

Metric	Meaning
Verifications	Number of verification cycles in the session
Fixes	Number of fix cycles (each fail verdict starts a fix)
Avg verify	Average duration of a verification cycle
Avg fix	Average duration of a fix cycle
First verify	Time from session start to first verification

Verification Quality

Metric	Meaning
First-pass rate	Percentage of verification chains where the first verdict was pass
Verdicts	Total verdict count (pass + fail)
Avg retries	Average number of fail verdicts before pass per chain
Avg checks	Average number of checks performed per verdict
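
First-pass rate and average retries are both defined per verification chain. The sketch below shows one way to compute them from an ordered list of verdict statuses. It assumes a chain is a run of verdicts that ends at the first pass; that definition, and the function names, are assumptions made for illustration rather than taken from IronBee's source.

```typescript
type Status = "pass" | "fail";

// Assumption: a chain is a run of verdicts that ends at the first pass.
function splitChains(verdicts: Status[]): Status[][] {
  const chains: Status[][] = [];
  let current: Status[] = [];
  for (const s of verdicts) {
    current.push(s);
    if (s === "pass") {
      chains.push(current);
      current = [];
    }
  }
  if (current.length > 0) chains.push(current); // trailing chain that never passed
  return chains;
}

// Percentage of chains whose first verdict was a pass.
function firstPassRate(chains: Status[][]): number {
  if (chains.length === 0) return 0;
  const firstPass = chains.filter((c) => c[0] === "pass").length;
  return (firstPass / chains.length) * 100;
}

// Average number of fail verdicts per chain.
function avgRetries(chains: Status[][]): number {
  if (chains.length === 0) return 0;
  const fails = chains.reduce(
    (n, c) => n + c.filter((s) => s === "fail").length,
    0
  );
  return fails / chains.length;
}
```

For the sequence pass, fail, pass this yields two chains, a first-pass rate of 50, and an average of 0.5 retries.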

Per-cycle structural metrics (avg console errors, avg network failures, avg pages tested) are temporarily absent because they depended on agent-claimed evidence that has been removed from the verdict shape. They will return once verify-gate derives structural evidence from tool_call records.

Code Changes

Metric	Meaning
Total edits	Total file edit operations in the session
Unique files	Number of distinct files edited
Avg per verify	Average file edits before each verification
Avg per fix	Average file edits during each fix cycle
Hot Files	Top 5 most frequently edited files
Problematic Files	Top 5 files with most edits during fix cycles
Edit Churn	Files edited in 2+ separate fix cycles (root cause may not be resolved)
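
The file-level metrics above reduce to counting. The sketch below illustrates Hot Files and Edit Churn, assuming edits are available as plain lists of file paths; the function names and input shapes are hypothetical and only the metric definitions come from the table.

```typescript
// Count occurrences of each file path in a flat list of edits.
function countEdits(edits: string[]): Record<string, number> {
  const counts: Record<string, number> = Object.create(null);
  for (const f of edits) counts[f] = (counts[f] ?? 0) + 1;
  return counts;
}

// Hot Files: the most frequently edited files (top 5 by default).
function topFiles(edits: string[], limit = 5): string[] {
  const counts = countEdits(edits);
  return Object.keys(counts)
    .sort((a, b) => counts[b] - counts[a] || a.localeCompare(b))
    .slice(0, limit);
}

// Edit Churn: files edited in 2+ separate fix cycles.
// Each inner array holds the files edited during one fix cycle.
function editChurn(fixCycles: string[][]): string[] {
  const cyclesPerFile: Record<string, number> = Object.create(null);
  for (const files of fixCycles) {
    const seen: Record<string, boolean> = Object.create(null);
    for (const f of files) {
      if (seen[f]) continue; // count each file once per fix cycle
      seen[f] = true;
      cyclesPerFile[f] = (cyclesPerFile[f] ?? 0) + 1;
    }
  }
  return Object.keys(cyclesPerFile)
    .filter((f) => cyclesPerFile[f] >= 2)
    .sort();
}
```

Problematic Files could be computed by passing only the edits made during fix cycles to `topFiles`. Ties are broken alphabetically here, which is a choice made for the sketch.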

Fix Effectiveness

Metric	Meaning
Success rate	Percentage of fixes followed by a pass verdict
Re-fail rate	Percentage of fixes followed by another fail verdict
Fix/verify	Ratio of fix cycles to verification cycles (0 = no fixes needed)
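
As a worked illustration of the three ratios above, the sketch below takes the status of the verdict that followed each fix, plus the number of verification cycles. The `fixEffectiveness` name and its inputs are hypothetical, and fixes with no following verdict are assumed to be excluded from the input.

```typescript
interface FixEffectiveness {
  successRate: number;  // percentage of fixes followed by a pass verdict
  reFailRate: number;   // percentage of fixes followed by another fail verdict
  fixPerVerify: number; // fix cycles divided by verification cycles
}

// afterFix holds the status of the verdict that followed each fix (assumed input shape).
function fixEffectiveness(
  afterFix: Array<"pass" | "fail">,
  verificationCount: number
): FixEffectiveness {
  const total = afterFix.length;
  const passes = afterFix.filter((s) => s === "pass").length;
  return {
    successRate: total === 0 ? 0 : (passes / total) * 100,
    reFailRate: total === 0 ? 0 : ((total - passes) / total) * 100,
    fixPerVerify: verificationCount === 0 ? 0 : total / verificationCount,
  };
}
```

With four fixes (three followed by a pass) across five verification cycles, this gives a success rate of 75, a re-fail rate of 25, and a fix/verify ratio of 0.8.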

Scoring

Three scores summarize the session:

Score	Formula	What it measures
Efficiency	`coding_time / (coding_time + fix_time) × 100`	Share of working time spent coding rather than fixing. A high score means little time was lost to fixes
Quality	`(pass_pct + checks_pct) / 2`	How thorough the verification was. Components: pass rate and check depth (5+ checks = 100%). Page-coverage and error-cleanliness components are temporarily removed because they depended on agent-claimed evidence; they will return when verify-gate derives structural evidence from `tool_call` records
Confidence	`pass_count / total_verdicts × 100`	How likely the agent's code is to work, based on the verdict pass rate
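
The three formulas in the table can be written out directly. The sketch below is an illustration rather than the CLI's code: the function names are hypothetical, zero denominators are mapped to 0 by choice, and the linear scaling of check depth below 5 checks is an assumption, since the table only states that 5+ checks count as 100%.

```typescript
// Efficiency: coding_time / (coding_time + fix_time) × 100
function efficiency(codingTime: number, fixTime: number): number {
  const total = codingTime + fixTime;
  return total === 0 ? 0 : (codingTime / total) * 100;
}

// Confidence: pass_count / total_verdicts × 100
function confidence(passCount: number, totalVerdicts: number): number {
  return totalVerdicts === 0 ? 0 : (passCount / totalVerdicts) * 100;
}

// Quality: (pass_pct + checks_pct) / 2
// Assumption: checks_pct grows linearly with average checks and caps at 5 checks.
function quality(passPct: number, avgChecks: number): number {
  const checksPct = Math.min(avgChecks / 5, 1) * 100;
  return (passPct + checksPct) / 2;
}
```

For example, 45 minutes of coding and 15 minutes of fixing give an efficiency of 75; 3 passes out of 4 verdicts give a confidence of 75; and a 75% pass rate with an average of 2.5 checks gives a quality of 62.5 under the linear-scaling assumption.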

Project Analysis

When run without a session ID, ironbee analyze aggregates metrics across all sessions:

Metric	Meaning
Session History	Each session's summary — duration, cycles, outcome, score
Avg duration	Average session duration across all sessions
Avg verifies	Average verification cycles per session
Avg fixes	Average fix cycles per session
First-pass rate	Percentage of sessions where the first verdict was pass
Fix success rate	Percentage of all fixes (across sessions) that succeeded
Abandon rate	Percentage of sessions with interrupted verification/fix cycles
Avg efficiency	Average efficiency score across all sessions
Avg confidence	Average confidence score across all sessions
Problematic Files	Top 5 files with most fix edits across all sessions

Telemetry

IronBee collects anonymous usage data to help improve the product. No source code, file contents, or personally identifiable information is ever sent.

Events collected: install/uninstall, session start, verdict submissions (pass/fail status only), and verification gate decisions.

To opt out, set the environment variable:

export IRONBEE_TELEMETRY=false

Or set telemetryEnabled: false in ~/.ironbee/telemetry.json.

Development

Requires Node.js ≥ 22 (Node 20 hit EOL on 2026-04-30).

npm install
npm run build       # tsc + scripts/copy-assets.js (mirrors .md/.mdc + assets/ to dist/)
npm run lint        # ESLint
npm run test        # Jest (unit + integration + client tests)
npm run dev         # Run via ts-node

CI runs the full test suite across linux × x64/arm64, darwin (Apple Silicon), and windows × x64/arm64 with Node 22 and 24. The build script is pure Node (no bash) so npm run build produces identical output on every OS.

License

Elastic License 2.0 (ELv2): free to use, copy, modify, distribute, and embed in your own products. The main restriction is that you may not offer IronBee itself (or a substantially similar derivative) as a hosted or managed service to third parties; see the LICENSE file for the full terms.
