CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Always use context7 when I need code generation, setup or configuration steps, or library/API documentation. This means you should automatically use the Context7 MCP tools to resolve library id and get library docs without me having to explicitly ask.

Project Overview

Actionbook is a website UI Action service platform that provides AI Agents with accurate, real-time website operation information (element selectors, operation methods, page structure). The core value proposition: "Let AI Agents precisely operate any website without repeatedly learning page structures."

Actionbook bridges AI Agents and website Action libraries through the MCP protocol, so Agents can directly obtain verified selectors and operation methods without parsing pages each time.

Architecture

Domain Model Hierarchy

Site → Page → Element → ElementAction
                    ↘ Scenario → ScenarioStep

Site: Website domain with metadata, health score, tags
Page: Functional page type with URL patterns
Element: Interactive UI element with semantic ID
ElementAction: Selectors (css, xpath, ariaLabel, dataTestId) and allowed methods (click, type, etc.)
Scenario: Complete user operation flow composed of multiple steps

Module Structure

actionbook/
├── packages/
│   ├── js-sdk/             # @actionbookdev/sdk - JavaScript SDK with tool definitions
│   ├── mcp/                # @actionbookdev/mcp - MCP Server (standalone, publishable to npm)
│   ├── cli/                   # @actionbookdev/cli - Command line interface (Rust CLI + npm wrapper)
│   └── tools-ai-sdk/       # @actionbookdev/tools-ai-sdk - Vercel AI SDK tools integration
├── playground/             # Demo and example projects
│   ├── rust-learner/       # Rust learner plugin example
│   └── stagehand-agent/    # Stagehand agent example
└── eval/                   # Evaluation framework

Core MCP Tools

search_actions - Search for actions by keyword
get_action_by_id - Get action content by ActionId

ActionId format: site/{domain}/page/{pageType}/element/{semanticId}

Data Flow

Agent calls MCP tool search_actions or get_action_by_id
MCP Server forwards to API Service
API Service queries database
Returns ActionMeta/ActionContent to Agent
Agent uses returned selectors to execute operations

Technology Stack

Language: TypeScript 5.x, Node.js 20+
Package Manager: pnpm (monorepo)
Build System: Turborepo (caching, parallel builds)
MCP Protocol: @modelcontextprotocol/sdk
AI SDK: Vercel AI SDK
Validation: Zod schemas

Development Commands

# Install dependencies
pnpm install

# Development mode (all packages)
pnpm dev

# Build all (with caching via Turborepo)
pnpm build

# Run tests
pnpm test

# Lint all packages
pnpm lint

# Clean all build outputs
pnpm clean

Turborepo Commands

# Build specific package
pnpm build --filter=@actionbookdev/sdk

# Build package and its dependencies
pnpm build --filter=@actionbookdev/mcp...

Package-Specific Commands

JavaScript SDK (packages/js-sdk):

cd packages/js-sdk
pnpm build            # Build the SDK
pnpm test             # Run tests

MCP Server (packages/mcp):

cd packages/mcp
pnpm build            # Build MCP server
pnpm test             # Run tests

CLI (packages/cli):

cd packages/cli
pnpm build            # Build CLI
pnpm test             # Run tests

AI SDK Tools (packages/tools-ai-sdk):

cd packages/tools-ai-sdk
pnpm build            # Build AI SDK tools
pnpm test             # Run tests

Key Design Decisions

SDK + MCP separation: @actionbookdev/sdk provides core types and tool definitions; @actionbookdev/mcp depends on SDK for MCP protocol implementation
Query-only MCP: MCP provides queries only; Agent executes operations itself
AI SDK integration: @actionbookdev/tools-ai-sdk provides Vercel AI SDK compatible tools

Environment Variables

Each package manages its own .env file. Check the .env.example in each package.

Development Workflow

This is a pnpm workspace with Turborepo - always use pnpm instead of npm or yarn
Node.js 18+ required (20+ recommended), pnpm 10+ required
Copy .env.example to .env in each package you're working with
Check for existing CLAUDE.md files in subdirectories for package-specific guidance
Follow existing code patterns and conventions in each workspace

File Organization

IMPORTANT: When creating files during development, follow these conventions:

Documentation

All documentation files created during implementation should be placed in:

.docs/

This includes:

Architecture documentation
Implementation guides
API documentation
Design decisions
Troubleshooting guides

Published Packages

The following packages are published to npm:

Package	npm Name	Description
`packages/js-sdk`	`@actionbookdev/sdk`	Core SDK with types and tool definitions
`packages/mcp`	`@actionbookdev/mcp`	MCP Server implementation (CLI: `actionbook-mcp`)
`packages/cli`	`@actionbookdev/cli`	Command line interface
`packages/tools-ai-sdk`	`@actionbookdev/tools-ai-sdk`	Vercel AI SDK tools integration

Release Workflow

This project uses Changesets for versioning and GitHub Actions for automated publishing. Manual publishing is not supported — all releases go through CI.

Developer Steps

# 1. After making changes, create a changeset
pnpm changeset
# Select affected packages, semver bump type (major/minor/patch), and description

# 2. Commit the generated .changeset/*.md file along with your changes
git add .changeset/*.md
git commit -m "[scope]feat: description"

# 3. Push/merge to main — CI handles the rest

CI Release Process (Two-Phase)

Phase 1 — Version PR: When changesets are detected on main, CI automatically creates a "Version Packages" PR that updates package versions, CHANGELOGs, and runs scripts/sync-versions.js to keep derived manifests in sync.

Phase 2 — Publish: After the Version PR is merged (no pending changesets), CI compares local versions against the npm registry and publishes only changed packages:

Type	Packages	Publish Target
JS packages	`sdk`, `mcp`, `tools-ai-sdk`, `json-ui`	npm (`npm publish --provenance`)
CLI	`@actionbookdev/cli` + 6 platform packages	Rust cross-compile → npm + GitHub Release
Extension	`actionbook-extension`	ZIP → GitHub Release
Dify Plugin	`dify-plugin`	`.difypkg` → GitHub Release

Version Sync

scripts/sync-versions.js runs automatically after changeset version to sync:

CLI main version → 6 platform-specific package.json files
Extension package.json → manifest.json
Dify plugin package.json → manifest.yaml + pyproject.toml

Git Branch Naming Convention

Branch names MUST follow these formats:

feature/xxx — new features
bugfix/xxx — bug fixes
release/x.x.x — release branches

When creating worktrees, the branch name should match the convention (e.g., feature/cli-package-json).

Git Commit Message Convention

IMPORTANT: This is a monorepo. All commit messages MUST follow this format:

[scope]type: description

[optional body]

[optional footer]

[scope]: The workspace/package path in square brackets, or [root] for root-level files
- Workspace examples: [packages/js-sdk], [packages/mcp], [playground/rust-learner]
- Root-level: [root] (for files like CLAUDE.md, package.json, tsconfig.json, etc.)
type: Conventional commit type (feat, fix, docs, refactor, test, chore, etc.)
description: Brief description of the change

Examples:

[packages/js-sdk]fix: correct ESM export path in package.json
[packages/mcp]feat: add new tool for action search
[playground/rust-learner]docs: update README with setup instructions
[root]docs: update CLAUDE.md
[root]chore: update pnpm-workspace.yaml

Multi-package changes: Use the primary affected package as scope.

Actionbook CLI

In packages/cli/, should follow the rules

Product Principles

Stateless interface, stateful runtime.
The CLI interface facing agents is completely stateless — every command explicitly addresses via --session and --tab, is self-contained, and depends on no side effects from prior commands. The daemon itself is stateful,
holding the CDP connection pool and session/tab registry. The key distinction: the agent doesn't need to track any state; the daemon manages it on its behalf.
Absolute-path addressing, like a filesystem.
The core analogy: humans open a file in an IDE before editing; agents call write(full_path, content) directly. All per-tab commands must include --session + --tab — omitting either is an error. There is no implicit
"current tab" concept. This eliminates global locks and makes multi-tab parallel operations a first-class citizen.
Designed for LLM consumers, not humans.
Output defaults to compact text rather than JSON, because LLMs consume tokens more efficiently that way. Every response carries a [session tab] url prefix so the agent always knows its context. Short IDs (s0, t3) replace UUIDs, compressing addressing from 40+ tokens down to 3–4.
Errors as guidance.
Every error response includes a hint field telling the agent what to do next. For example, SESSION_NOT_FOUND suggests run browser launch. This reduces the tokens agents waste on error recovery.

Architecture

Three-layer decoupling. CLI layer, Daemon layer, and Browser layer are independent, interacting only through protocols and trait boundaries. CLI layer handles argument parsing and output formatting only — no browser access. Daemon layer handles IPC routing and connection management only — no command semantics. Browser layer abstracts backends via traits — agnostic to whether commands come from CLI or daemon. Anti-pattern: calling chromiumoxide::Browser::connect() then println! directly in a command handler — that couples all three layers.

Rust Engineering Constraints

Use mature crates, but stay vigilant. clap derive for CLI, figment for config, thiserror for errors. Be cautious with chromiumoxide — it's 0.8, not 1.0. Its full CDP codegen causes slow compilation and binary bloat. If it falls behind Chrome's CDP evolution, have a fallback plan: thin WS client + serde_json::Value. The current Cargo.toml depending on both chromiumoxide and tokio-tungstenite is a redundancy signal — chromiumoxide uses tungstenite internally. If you also need raw WS, its abstraction is leaking.
Phase out async-trait, except for dyn Trait. Rust 1.75+ natively supports async fn in trait (RPITIT) — prefer it for new code. However, BrowserBackend requires dynamic dispatch (Box<dyn BrowserBackend>), which still needs async-trait. Rule: use RPITIT internally, keep async-trait only for dyn-exposed interfaces.
Feature flags must gate compilation. The current stealth = [] is an empty feature — stealth code is always compiled. Use #[cfg(feature = "stealth")] to gate entire modules. camoufox does this correctly (gates thirtyfour dependency); stealth should follow.
panic = "abort" means Drop won't run. Daemon socket/PID file cleanup cannot rely on Drop guards. Must use signal handlers (SIGTERM/SIGINT) for explicit cleanup. catch_unwind is unavailable — all errors must go through Result.
Graceful shutdown is mandatory. Register tokio::signal for SIGINT/SIGTERM. On signal, daemon must close controlled browser instances and clean up socket/PID files to prevent zombie processes. This pairs with the panic=abort constraint — Drop cannot be relied on for cleanup.
opt-level = "z" is the right choice. For a network-IO-bound CLI, CPU is rarely the bottleneck. If snapshot parsing or fingerprint generation shows performance issues, confirm with criterion benchmarks first, then consider splitting hot functions into a separate crate with a different opt-level. Don't change the whole profile because of one slow function.
Use edition 2024. MSRV: latest stable minus one. This is a CLI tool, not a library depended on by other crates.
No blocking IO in async context. Daemon hot paths (socket read/write, CDP message processing) must use tokio::fs. Cold paths (config loading, PID files) may use std::fs, but that's a deliberate tradeoff, not an oversight.
Keep terminal UI deps minimal. console, indicatif, dialoguer are sufficient. Don't also pull in colored (overlaps with console). This is a CLI tool, not a TUI app.
No unsafe outside FFI. The only permitted unsafe usages are Native Messaging cross-process communication and libc signal handlers.

Output & Errors

Dual-channel output. stdout for machines, stderr for humans. When --json is set, stdout must emit only valid JSON — zero pollution. Logs, progress, warnings all go to stderr. Anti-pattern: printing Starting browser... to stdout in --json mode, breaking jq parsing.
Errors must be typed. Every error has a distinct variant name and machine-readable code (e.g. browser_not_found, cdp_connection_failed). Library layer uses thiserror; CLI top-level may use anyhow for propagation, but bare Err("something failed".into()) or anyhow! is forbidden in library code. AI agents rely on error codes for retry strategy — string errors are useless.
Flags use kebab-case, env vars use ACTIONBOOK_ prefix. Every global flag must support both CLI and env input. clap's #[arg(env = "...")] handles this directly.
Config precedence: CLI flag > env var (ACTIONBOOK_*) > config file (~/.actionbook/config.toml) > defaults. figment natively supports this chain.
Use tracing, not log. In daemon mode, structured logs must carry session_id and profile_name fields — otherwise concurrent sessions produce indistinguishable logs. Level controlled via RUST_LOG or --verbose.

Security & Feature Boundaries

CDP method security levels. L1: read-only (screenshot, DOM read). L2: modification (click, input, navigation). L3: high-risk (file download, permission grant). L3 is denied by default in Extension Bridge mode. Anti-pattern: no level enforcement in Extension Bridge, allowing Browser.grantPermissions — attacker gains camera access via malicious page.
Stealth off by default. Anti-detection is opt-in. Users must explicitly enable it. Anti-pattern: stealth on by default causes non-reproducible E2E tests — User-Agent differs on every run.
Binary size is a hard constraint. Target < 10MB (stripped). Means: opt-level="z" + LTO + strip + panic="abort" + feature-gate non-core modules. Periodically run cargo bloat to track size contributors.

gstack

Available Skills

/office-hours - Brainstorm and explore ideas
/plan-ceo-review - CEO/founder-mode plan review
/plan-eng-review - Engineering manager plan review
/plan-design-review - Design plan review
/design-consultation - Create a design system
/review - Pre-landing PR review
/ship - Ship workflow (merge, test, bump, PR)
/land-and-deploy - Land and deploy changes
/canary - Canary deployment
/benchmark - Run benchmarks
/qa - QA testing
/qa-only - QA testing (test only, no fixes)
/design-review - Visual design audit
/setup-browser-cookies - Set up browser cookies
/setup-deploy - Set up deployment
/retro - Weekly engineering retrospective
/investigate - Debug and investigate errors
/document-release - Post-ship documentation
/codex - Adversarial code review
/careful - Production/live systems safety mode
/freeze - Scope edits to one module/directory
/guard - Maximum safety mode
/unfreeze - Remove edit restrictions
/gstack-upgrade - Upgrade gstack to latest version

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLAUDE.md

Project Overview

Architecture

Domain Model Hierarchy

Module Structure

Core MCP Tools

Data Flow

Technology Stack

Development Commands

Turborepo Commands

Package-Specific Commands

Key Design Decisions

Environment Variables

Development Workflow

File Organization

Documentation

Published Packages

Release Workflow

Developer Steps

CI Release Process (Two-Phase)

Version Sync

Git Branch Naming Convention

Git Commit Message Convention

Actionbook CLI

Product Principles

Architecture

Rust Engineering Constraints

Output & Errors

Security & Feature Boundaries

gstack

Available Skills

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

CLAUDE.md

Project Overview

Architecture

Domain Model Hierarchy

Module Structure

Core MCP Tools

Data Flow

Technology Stack

Development Commands

Turborepo Commands

Package-Specific Commands

Key Design Decisions

Environment Variables

Development Workflow

File Organization

Documentation

Published Packages

Release Workflow

Developer Steps

CI Release Process (Two-Phase)

Version Sync

Git Branch Naming Convention

Git Commit Message Convention

Actionbook CLI

Product Principles

Architecture

Rust Engineering Constraints

Output & Errors

Security & Feature Boundaries

gstack

Available Skills