Agent Mode

agent-browser works with any AI coding agent. Use --json for machine-readable output.

Compatible agents

Claude Code
Cursor
GitHub Copilot
OpenAI Codex
Google Gemini
opencode
Any agent that can run shell commands

JSON output

agent-browser snapshot --json
# {"success":true,"data":{"snapshot":"...","refs":{...}}}

agent-browser get text @e1 --json
agent-browser is visible @e2 --json

Optimal workflow

# 1. Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot -i --json   # AI parses tree and refs

# 2. AI identifies target refs from snapshot
# 3. Execute actions using refs
agent-browser click @e2
agent-browser fill @e3 "input text"

# 4. Get new snapshot if page changed
agent-browser snapshot -i --json

Integration

Just ask

The simplest approach:

Use agent-browser to test the login flow. Run agent-browser --help to see available commands.

The --help output is comprehensive.

AGENTS.md / CLAUDE.md

For consistent results, add to your instructions file:

## Browser Automation

Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.

Core workflow:
1. `agent-browser open <url>` - Navigate to page
2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
4. Re-snapshot after page changes

Claude Code skill

For richer context:

cp -r node_modules/agent-browser/skills/agent-browser .claude/skills/

Or download:

mkdir -p .claude/skills/agent-browser
curl -o .claude/skills/agent-browser/SKILL.md \
  https://raw.githubusercontent.com/vercel-labs/agent-browser/main/skills/agent-browser/SKILL.md