The Computer-Use Agents SOTA Challenge, hosted at Hack the North and online, has concluded!
Track A (On-site @ UWaterloo): 🏆 Prize: YC interview guaranteed. Concluded.
Track B (Remote): 🏆 Prize: Cash award. Concluded; winners will be announced soon.
👉 Sign up here: trycua.com/hackathon
cua ("koo-ah") is Docker for Computer-Use Agents - it enables AI agents to control full operating systems in virtual containers and deploy them locally or to the cloud.
With the Computer SDK, you can:
- automate Windows, Linux, and macOS VMs with a consistent, pyautogui-like API
- create & manage VMs locally or using cua cloud
With the Agent SDK, you can:
- run computer-use models with a consistent schema
- benchmark on OSWorld-Verified, SheetBench-V2, and more with a single line of code using HUD (Notebook)
- combine UI grounding models with any LLM using composed agents
- use new UI agent models and UI grounding models from the Model Zoo below with just a model string (e.g., ComputerAgent(model="openai/computer-use-preview"))
- use API or local inference by changing a prefix (e.g., openai/, openrouter/, ollama/, huggingface-local/, mlx/, etc.)
All-in-one CUAs | UI Grounding Models | UI Planning Models |
---|---|---|
anthropic/claude-sonnet-4-5-20250929, anthropic/claude-haiku-4-5-20251001 | huggingface-local/xlangai/OpenCUA-{7B,32B} | any all-in-one CUA |
openai/computer-use-preview | huggingface-local/HelloKKMe/GTA1-{7B,32B,72B} | any VLM (using liteLLM, requires tools parameter) |
openrouter/z-ai/glm-4.5v | huggingface-local/Hcompany/Holo1.5-{3B,7B,72B} | any LLM (using liteLLM, requires moondream3+ prefix) |
gemini-2.5-computer-use-preview-10-2025 | any all-in-one CUA | |
huggingface-local/OpenGVLab/InternVL3_5-{1B,2B,4B,8B,...} | | |
huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B | | |
moondream3+{ui planning} (supports text-only models) | | |
omniparser+{ui planning} | | |
{ui grounding}+{ui planning} | | |

- human/human → Human-in-the-Loop
Missing a model? Raise a feature request or contribute!
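The model strings above share a simple shape: an optional provider prefix before the first slash, and an optional {ui grounding}+{ui planning} composition joined by a plus sign. The sketch below illustrates that shape with a hypothetical helper, parse_model_string. It is not part of the Cua SDK and makes no claim about how ComputerAgent resolves models internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    """One model reference: an optional provider prefix plus a model name."""
    prefix: Optional[str]
    name: str


def parse_single(model: str) -> ModelSpec:
    # Treat everything before the first "/" as the provider prefix.
    prefix, sep, name = model.partition("/")
    if not sep:
        # No slash at all, e.g. "moondream3" or a bare model name.
        return ModelSpec(prefix=None, name=model)
    return ModelSpec(prefix=prefix, name=name)


def parse_model_string(model: str) -> list[ModelSpec]:
    # A composed agent is written "{ui grounding}+{ui planning}";
    # split on the first "+" only (hypothetical rule for illustration).
    grounding, sep, planning = model.partition("+")
    parts = [grounding, planning] if sep else [model]
    return [parse_single(p) for p in parts]


if __name__ == "__main__":
    for s in (
        "openai/computer-use-preview",
        "huggingface-local/HelloKKMe/GTA1-7B+openai/computer-use-preview",
    ):
        print(s, "->", parse_model_string(s))
```

Keeping the prefix separate from the name mirrors the point made in the list above: switching between API and local inference changes only the prefix, not the rest of the string.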
- Get started with a Computer-Use Agent UI
- Get started with the Computer-Use Agent CLI
- Get started with the Python SDKs
Usage (Docs)
pip install "cua-agent[all]"

from agent import ComputerAgent

# `computer` is a Computer instance; see the Computer section below.
agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20241022",
    tools=[computer],
    max_trajectory_budget=5.0
)

messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]

# `async for` must run inside an async function (or a notebook cell).
async for result in agent.run(messages):
    for item in result["output"]:
        if item["type"] == "message":
            print(item["content"][0]["text"])
{
    "output": [
        # user input
        {
            "role": "user",
            "content": "go to trycua on gh"
        },
        # first agent turn adds the model output to the history
        {
            "summary": [
                {
                    "text": "Searching Firefox for Trycua GitHub",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning"
        },
        {
            "action": {
                "text": "Trycua GitHub",
                "type": "type"
            },
            "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
            "status": "completed",
            "type": "computer_call"
        },
        # second agent turn adds the computer output to the history
        {
            "type": "computer_call_output",
            "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
            "output": {
                "type": "input_image",
                "image_url": "data:image/png;base64,..."
            }
        },
        # final agent turn adds the agent output text to the history
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "text": "Success! The Trycua GitHub page has been opened.",
                    "type": "output_text"
                }
            ]
        }
    ],
    "usage": {
        "prompt_tokens": 150,
        "completion_tokens": 75,
        "total_tokens": 225,
        "response_cost": 0.01,
    }
}
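Each result carries an output list of typed items plus a usage record, so a small helper can pull out the parts a caller usually wants. The following is a minimal sketch, assuming results shaped like the example above. The name summarize_result is hypothetical (it is not an SDK function), and it reads only keys that appear in that example.

```python
from typing import Any


def summarize_result(result: dict[str, Any]) -> dict[str, Any]:
    """Collect assistant text, action types, and cost from one result dict."""
    texts: list[str] = []
    actions: list[str] = []
    for item in result.get("output", []):
        kind = item.get("type")  # user input items carry no "type" key
        if kind == "message" and item.get("role") == "assistant":
            # Assistant messages hold a list of content parts.
            texts.extend(
                part["text"]
                for part in item["content"]
                if part.get("type") == "output_text"
            )
        elif kind == "computer_call":
            actions.append(item["action"]["type"])
    usage = result.get("usage", {})
    return {
        "texts": texts,
        "actions": actions,
        "cost": usage.get("response_cost", 0.0),
    }


# A trimmed copy of the example above, used only to exercise the helper.
example = {
    "output": [
        {"role": "user", "content": "go to trycua on gh"},
        {
            "action": {"text": "Trycua GitHub", "type": "type"},
            "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
            "status": "completed",
            "type": "computer_call",
        },
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "text": "Success! The Trycua GitHub page has been opened.",
                    "type": "output_text",
                }
            ],
        },
    ],
    "usage": {"response_cost": 0.01},
}

print(summarize_result(example))
```

Using .get for the type lookup avoids a KeyError on items such as the user input, which has a role but no type.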
Computer (Docs)
pip install "cua-computer[all]"

from computer import Computer

# `async with` must run inside an async function (or a notebook cell).
async with Computer(
    os_type="linux",
    provider_type="cloud",
    name="your-sandbox-name",
    api_key="your-api-key"
) as computer:
    # Take screenshot
    screenshot = await computer.interface.screenshot()

    # Click and type
    await computer.interface.left_click(100, 100)
    await computer.interface.type("Hello!")
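The agent snippet earlier passes a computer object in tools, and this section shows how such an object is created. Putting the two together might look like the sketch below. It uses only the calls shown in this README. The environment variable name CUA_API_KEY and the sandbox name are placeholders chosen for illustration, and the SDK imports sit inside main() so the file can be loaded without the packages installed.

```python
import asyncio
import os


async def main() -> None:
    # Imported here so that loading this file does not require the SDKs.
    from agent import ComputerAgent
    from computer import Computer

    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-sandbox-name",           # placeholder sandbox name
        api_key=os.environ["CUA_API_KEY"],  # assumed variable name
    ) as computer:
        agent = ComputerAgent(
            model="anthropic/claude-3-5-sonnet-20241022",
            tools=[computer],
            max_trajectory_budget=5.0,
        )
        messages = [
            {"role": "user", "content": "Take a screenshot and tell me what you see"}
        ]
        async for result in agent.run(messages):
            for item in result["output"]:
                # .get() tolerates items that have no "type" key.
                if item.get("type") == "message":
                    print(item["content"][0]["text"])
```

To run it, call asyncio.run(main()) from a script, or await main() in a notebook cell.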
- How to use the MCP Server with Claude Desktop or other MCP clients - One of the easiest ways to get started with Cua
- How to use OpenAI Computer-Use, Anthropic, OmniParser, or UI-TARS for your Computer-Use Agent
- How to use Lume CLI for managing desktops
- Training Computer-Use Models: Collecting Human Trajectories with Cua (Part 1)
Module | Description | Installation |
---|---|---|
Lume | VM management for macOS/Linux using Apple's Virtualization.Framework | curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash |
Lumier | Docker interface for macOS and Linux VMs | docker pull trycua/lumier:latest |
Computer (Python) | Python Interface for controlling virtual machines | pip install "cua-computer[all]" |
Computer (TypeScript) | TypeScript Interface for controlling virtual machines | npm install @trycua/computer |
Agent | AI agent framework for automating tasks | pip install "cua-agent[all]" |
MCP Server | MCP server for using CUA with Claude Desktop | pip install cua-mcp-server |
SOM | Set-of-Mark library for Agent | pip install cua-som |
Computer Server | Server component for Computer | pip install cua-computer-server |
Core (Python) | Python Core utilities | pip install cua-core |
Core (TypeScript) | TypeScript Core utilities | npm install @trycua/core |
Join our Discord community to discuss ideas, get assistance, or share your demos!
Cua is open-sourced under the MIT License - see the LICENSE file for details.
Portions of this project, specifically components adapted from Kasm Technologies Inc., are also licensed under the MIT License. See libs/kasm/LICENSE for details.
Microsoft's OmniParser, which is used in this project, is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0). See the OmniParser LICENSE for details.
Some optional extras for this project depend on third-party packages that are licensed under terms different from the MIT License.
- The optional "omni" extra (installed via pip install "cua-agent[omni]") installs the cua-som module, which includes ultralytics and is licensed under the AGPL-3.0.
When you choose to install and use such optional extras, your use, modification, and distribution of those third-party components are governed by their respective licenses (e.g., AGPL-3.0 for ultralytics).
Cua uses bump2version to manage package versions across all Python modules. A Makefile is provided to simplify the release process.
To install bump2version using brew:
brew install bumpversion
To view current package versions:
make show-versions
To bump a specific package version:
# Patch version bump (e.g., 0.1.8 → 0.1.9)
make bump-patch-core # cua-core
make bump-patch-pylume # pylume
make bump-patch-computer # cua-computer
make bump-patch-som # cua-som
make bump-patch-agent # cua-agent
make bump-patch-computer-server # cua-computer-server
make bump-patch-mcp-server # cua-mcp-server
# Minor version bump (e.g., 0.1.8 → 0.2.0)
make bump-minor-core # Replace 'core' with any package name
# Major version bump (e.g., 0.1.8 → 1.0.0)
make bump-major-core # Replace 'core' with any package name
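The examples in the comments above follow the usual semantic-versioning rule: bumping a part increments it and resets every part to its right to zero. A short Python sketch of that rule, using a hypothetical bump function (bump2version itself is not called here):

```python
def bump(version: str, part: str) -> str:
    """Return `version` with the given part ("major", "minor", "patch") bumped."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("0.1.8", "patch"))  # 0.1.9
print(bump("0.1.8", "minor"))  # 0.2.0
print(bump("0.1.8", "major"))  # 1.0.0
```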
To preview changes without modifying files:
make dry-run-patch-core # Test patch bump for cua-core
make dry-run-minor-pylume # Test minor bump for pylume
make dry-run-major-agent # Test major bump for cua-agent
To bump all packages at once:
make bump-all-patch # Bumps patch version for ALL packages
After running any bump command, push your changes:
git push origin main && git push origin --tags
For more details, run make help or see the Makefile.
We welcome contributions to Cua! Please refer to our Contributing Guidelines for details.
Apple, macOS, and Apple Silicon are trademarks of Apple Inc.
Ubuntu and Canonical are registered trademarks of Canonical Ltd.
Microsoft is a registered trademark of Microsoft Corporation.
This project is not affiliated with, endorsed by, or sponsored by Apple Inc., Canonical Ltd., Microsoft Corporation, or Kasm Technologies.
Thank you to all our supporters!
Thank you to all our GitHub Sponsors!