Stars
《MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation》
AI agents can now use real Android and iOS apps, just like a human.
🚀 The fast, Pythonic way to build MCP servers and clients
Expose your FastAPI endpoints as Model Context Protocol (MCP) tools, with Auth!
Model Context Protocol Server for Mobile Automation and Scraping (iOS, Android, Emulators, Simulators and Real Devices)
Pure Python adb library for the Google adb service.
gRPC proxy is a Go reverse proxy that allows for rich routing of gRPC calls with minimum overhead.
A full-scale testing platform for each stage of your development and operations lifecycle.
One minute of voice data can be enough to train a good TTS model (few-shot voice cloning).
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Open-source search and retrieval database for AI applications.
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users
Write Model Context Protocol servers in a few lines of Go code. Docs at https://mcpgolang.com. Created by https://metoro.io
The official Python SDK for Model Context Protocol servers and clients
All-in-One Development Tool based on PaddlePaddle
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Janus-Series: Unified Multimodal Understanding and Generation Models
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Fixes the following messages that Cursor shows during the free subscription period: Your request has been blocked as our system has detected suspicious activity / You've reached your trial request limit. / Too many free trial accounts used on this machine.
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
RTSP client and server library for the Go programming language
AlexSnet / go-vnc
Forked from madddi/go-vnc
VNC client and server library for Go.