forked from trycua/acu

A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.

Mekcyed/acu

 
 

ACU - Awesome Agents for Computer Use

An AI Agent for Computer Use is an autonomous program that reasons about tasks, plans sequences of actions, and acts on a computer or mobile device through clicks, keystrokes, other input events, command-line operations, and internal or external API calls. These agents combine perception, decision-making, and control to interact with digital interfaces and accomplish user-specified goals independently.
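The perceive, decide, act cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design of any project listed here: `Action`, `FakeDesktop`, `fill_search_box`, and `run_agent` are hypothetical names, the "screen" is an in-memory stand-in, and the rule-based policy takes the place of the language or vision model a real agent would call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Action:
    """One low-level step the agent can take (hypothetical schema)."""
    kind: str          # "click" or "type" in this sketch
    target: str = ""   # id of the UI element to click
    text: str = ""     # text payload for "type" actions


@dataclass
class FakeDesktop:
    """In-memory stand-in for a real screen; it only records what was done."""
    focused: Optional[str] = None
    fields: dict = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def observe(self) -> dict:
        # Perception: a real agent would return a screenshot or accessibility tree.
        return {"focused": self.focused, "fields": dict(self.fields)}

    def execute(self, action: Action) -> None:
        # Control: apply the action to the simulated interface.
        if action.kind == "click":
            self.focused = action.target
        elif action.kind == "type" and self.focused is not None:
            self.fields[self.focused] = self.fields.get(self.focused, "") + action.text
        self.log.append(f"{action.kind}:{action.target or action.text}")


Policy = Callable[[str, dict], Optional[Action]]


def fill_search_box(goal: str, observation: dict) -> Optional[Action]:
    """Rule-based placeholder for the decision step (a model call in practice)."""
    if observation["fields"].get("search") == goal:
        return None  # goal reached, nothing left to do
    if observation["focused"] != "search":
        return Action("click", target="search")
    return Action("type", text=goal)


def run_agent(goal: str, env: FakeDesktop, policy: Policy, max_steps: int = 10) -> int:
    """Loop observe -> decide -> act; return the number of actions executed."""
    for step in range(max_steps):
        action = policy(goal, env.observe())
        if action is None:
            return step
        env.execute(action)
    return max_steps


if __name__ == "__main__":
    desktop = FakeDesktop()
    steps = run_agent("weather", desktop, fill_search_box)
    print(steps, desktop.log, desktop.fields)
```

The `max_steps` cap is a deliberate choice in this sketch: an agent acting on a live interface needs some bound so that a policy that never reaches its goal cannot loop forever.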

A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.

Table of Contents

Articles

Papers

Surveys

Frameworks & Models

UI Grounding

Dataset

Benchmark

Safety

Projects

Open Source

Frameworks & Models

  • AutoGen

    • Framework for building AI agent systems.
    • It simplifies the creation of event-driven, distributed, scalable, and resilient agentic applications.
  • Auto-GPT

    • Autonomous GPT-4 agent
    • Task automation focus
  • Browser Use

    • Make websites accessible for AI agents with vision + HTML extraction
    • Supports multi-tab management and custom actions with LangChain integration
  • Claude Computer Use Demo

    • macOS implementation
    • Claude integration
  • Claude Minecraft Use

    • Game automation
    • Specialized use case
  • Computer Use OOTB

    • Ready-to-use implementation
    • Comprehensive toolset
  • Cybergod

    • Advanced computer control
  • Grunty

    • Computer control agent
    • Task automation focus
  • LaVague

    • AI web agent framework
    • Modular architecture
  • Mac Computer Use

    • macOS-specific tools
    • Anthropic integration
  • NatBot

    • Browser automation
    • GPT-4 Vision integration
  • OpenAdapt

    • AI-First Process Automation
    • Multimodal model integration
  • OpenInterface

    • Open-source UI interaction framework
    • Cross-platform support
  • OpenInterpreter

    • General-purpose computer control framework
    • Python-based, extensible architecture
  • Open Source Computer Use by E2B

    • Open-source implementation of computer control capabilities
    • Secure sandboxed environment for AI agents
  • Self-Operating Computer

    • Computer control framework
    • Vision-based automation
  • Skyvern

    • AI web agent framework
    • Automate browser-based workflows with LLMs using vision and HTML extraction
  • Surfkit

    • Device operation toolkit
    • Extensible agent framework
  • WebMarker

    • Web page annotation tool
    • Vision-language model support

Environment & Sandbox

Automation

  • nut.js

    • Native UI automation
    • JavaScript/TypeScript implementation
  • PyAutoGUI

    • Cross-platform GUI automation
    • Python-based control library

Commercial

Frameworks & Models

Contributing

Contributions are welcome! Feel free to submit a pull request.
