webarena Public Forked from web-arena-x/webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Python 0 Apache-2.0 210 0 0 Updated Nov 14, 2025
cuga-agent Public Forked from cuga-project/cuga-agent
CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware features.

Python 0 88 0 0 Updated Nov 14, 2025
appworld Public Forked from StonyBrookNLP/appworld
🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and Interactive Coding Agent, ACL'24 Best Resource Paper.

Python 0 Apache-2.0 51 0 0 Updated Nov 1, 2025
tau2-bench Public Forked from sierra-research/tau2-bench
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment

Python 0 MIT 129 0 0 Updated Oct 23, 2025
MobileAgent Public Forked from X-PLUG/MobileAgent
Mobile-Agent: The Powerful GUI Agent Family

Python 0 MIT 721 0 0 Updated Sep 14, 2025
open-interpreter Public Forked from openinterpreter/open-interpreter
A natural language interface for computers

Python 0 AGPL-3.0 5,398 0 0 Updated Aug 6, 2025
MaAS Public Forked from bingreeky/MaAS
[ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet

Python 0 26 0 0 Updated Jun 10, 2025
ToolBench Public Forked from OpenBMB/ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language model for tool learning.

Python 0 Apache-2.0 472 0 0 Updated May 21, 2025
super-benchmark Public Forked from allenai/super-benchmark

Jupyter Notebook 0 Apache-2.0 4 0 0 Updated Apr 4, 2025
MathSensei Public Forked from rakutentech/MathSensei

Python 0 Apache-2.0 1 0 0 Updated Sep 29, 2024

AI-Agentic-Tools-Benchmarking