MobiAgent is a powerful and customizable mobile agent system including:
- An agent model family: MobiMind
- An agent acceleration framework: AgentRR
- An agent benchmark: MobiFlow
System Architecture:
[2025.8.30]🔥 We've open-sourced MobiAgent!
Mobile App Demo:
MobiAgent_Demo.mp4
AgentRR Demo (Left: first task; Right: subsequent task)
AgentRR.mp4
- agent_rr/ - Agent Record & Replay framework
- collect/ - Data collection, annotation, processing and export tools
- runner/ - Agent executor that connects to the phone via ADB, executes tasks, and records execution traces
- MobiFlow/ - Agent evaluation benchmark based on milestone DAG
- app/ - MobiAgent Android app
- deployment/ - Service deployment for the MobiAgent mobile application
If you would like to try MobiAgent directly with our app, please download it from the Download Link.
If you would like to try MobiAgent with Python scripts that use Android Debug Bridge (ADB) to control your phone, please follow these steps:
conda create -n MobiMind python=3.10
conda activate MobiMind
pip install -r requirements.txt
# Download OmniParser model weights
for f in icon_detect/{train_args.yaml,model.pt,model.yaml} ; do huggingface-cli download microsoft/OmniParser-v2.0 "$f" --local-dir weights; done
# If you need GPU acceleration for OCR, install paddlepaddle-gpu according to your CUDA version
# For details, refer to https://www.paddlepaddle.org.cn/install/quick, for example CUDA 11.8:
python -m pip install paddlepaddle-gpu==3.1.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
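Before moving on, it can be worth confirming that the download loop above actually produced the three OmniParser files. The following is a minimal sketch, not part of MobiAgent: the helper name `missing_weights` is hypothetical, and it assumes the files land under `weights/icon_detect/`, which is where `--local-dir weights` places them.

```python
from pathlib import Path

# File names taken from the huggingface-cli download loop above.
EXPECTED_FILES = ("train_args.yaml", "model.pt", "model.yaml")


def missing_weights(local_dir="weights"):
    """Return the expected icon_detect files that are absent under local_dir."""
    icon_dir = Path(local_dir) / "icon_detect"
    return [
        str(icon_dir / name)
        for name in EXPECTED_FILES
        if not (icon_dir / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All OmniParser icon_detect weights are present.")
```

Run it from the repository root, so that the relative `weights` path matches the one used by the download command.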
- Download and install ADBKeyboard on your Android device
- Enable Developer Options on your Android device and allow USB debugging
- Connect your phone to the computer using a USB cable
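Once the three steps above are done, the connection can be checked from Python by reading the output of `adb devices`. This is a rough sketch with hypothetical helper names (`parse_adb_devices`, `ready_devices`, `connected_devices`); it assumes `adb` is on your `PATH`. A device in the `device` state is ready, while `unauthorized` usually means the USB debugging prompt on the phone has not been accepted yet.

```python
import subprocess


def parse_adb_devices(output):
    """Map each serial in `adb devices` output to its state."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, adb daemon notices, and the header line.
        if not line or line.startswith("*") or line.startswith("List of devices"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def ready_devices(output):
    """Return the serials whose state is 'device' (connected and authorized)."""
    return [serial for serial, state in parse_adb_devices(output).items()
            if state == "device"]


def connected_devices():
    """Run `adb devices` and return the serials that are ready to use."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True, timeout=30
    )
    return ready_devices(result.stdout)
```

Calling `connected_devices()` with the phone plugged in should return a list containing its serial number; an empty list suggests revisiting the USB debugging step.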
After downloading the decider, grounder, and planner models, use vLLM to deploy model inference services:
vllm serve IPADS-SAI/MobiMind-Decider-7B --port <decider port>
vllm serve IPADS-SAI/MobiMind-Grounder-3B --port <grounder port>
vllm serve Qwen/Qwen3-4B-Instruct --port <planner port>
Write the list of tasks that you would like to test in runner/mobiagent/task.json, then launch the agent runner:
python -m runner.mobiagent.mobiagent --service_ip <Service IP> --decider_port <Decider Service Port> --grounder_port <Grounder Service Port> --planner_port <Planner Service Port>
Parameters:
- --service_ip: Service IP (default: localhost)
- --decider_port: Decider service port (default: 8000)
- --grounder_port: Grounder service port (default: 8001)
- --planner_port: Planner service port (default: 8002)
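To illustrate how the parameters and their defaults combine, here is a small sketch that assembles the runner command line. The helper `build_runner_command` is hypothetical and only mirrors the documented flags and defaults; it does not inspect the runner itself.

```python
import shlex

# Defaults as documented in the parameter list above.
DEFAULTS = {
    "service_ip": "localhost",
    "decider_port": 8000,
    "grounder_port": 8001,
    "planner_port": 8002,
}


def build_runner_command(**overrides):
    """Return the runner invocation as an argument list, applying overrides."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    options = {**DEFAULTS, **overrides}
    cmd = ["python", "-m", "runner.mobiagent.mobiagent"]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


# Override only the planner port; everything else keeps its default.
print(shlex.join(build_runner_command(planner_port=9002)))
# → python -m runner.mobiagent.mobiagent --service_ip localhost --decider_port 8000 --grounder_port 8001 --planner_port 9002
```

The resulting list can be passed directly to `subprocess.run` if you prefer to launch the runner from a script.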
The runner automatically controls the device and invokes the agent models to complete the pre-defined tasks.
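Because the runner depends on all three inference services, a quick reachability check before launching can save a failed run. The sketch below queries the `/v1/models` endpoint that vLLM's OpenAI-compatible server exposes. The helper names are hypothetical, and the host and ports are the documented defaults, so adjust them to match your deployment.

```python
import json
import urllib.error
import urllib.request

# Documented default ports; change these if you served the models elsewhere.
SERVICES = {"decider": 8000, "grounder": 8001, "planner": 8002}


def models_url(host, port):
    """Build the model-listing URL of an OpenAI-compatible server."""
    return f"http://{host}:{port}/v1/models"


def served_models(host, port, timeout=5.0):
    """Return the model ids a server reports, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(models_url(host, port), timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
    return [model["id"] for model in payload.get("data", [])]


def check_services(host="localhost", services=SERVICES):
    """Print one status line per service and return True if all responded."""
    all_up = True
    for name, port in services.items():
        models = served_models(host, port)
        if models is None:
            print(f"{name}: no response on port {port}")
            all_up = False
        else:
            print(f"{name}: serving {', '.join(models)}")
    return all_up
```

Calling `check_services()` prints which model each port serves, which also helps catch a decider and grounder that were started on swapped ports.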
For detailed usage instructions, see the README.md files in each sub-module directory.
If you find MobiAgent useful in your research, please feel free to cite our paper:
@misc{zhang2025mobiagentsystematicframeworkcustomizable,
title={MobiAgent: A Systematic Framework for Customizable Mobile Agents},
author={Cheng Zhang and Erhu Feng and Xi Zhao and Yisheng Zhao and Wangbo Gong and Jiahui Sun and Dong Du and Zhichao Hua and Yubin Xia and Haibo Chen},
year={2025},
eprint={2509.00531},
archivePrefix={arXiv},
primaryClass={cs.MA},
url={https://arxiv.org/abs/2509.00531},
}
We gratefully acknowledge open-source projects such as MobileAgent, UI-TARS, and Qwen-VL. We also thank the National Innovation Institute of High-end Smart Appliances for their support of this project.