LightersWang
🎯 Focusing

Contexts Optical Compression

Python 15,168 901 Updated Oct 23, 2025

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 1,036 160 Updated Oct 17, 2025

The best ChatGPT that $100 can buy.

Python 31,660 3,398 Updated Oct 22, 2025

A lightweight, powerful framework for multi-agent workflows

Python 16,785 2,754 Updated Oct 23, 2025
Python 10 Updated Sep 29, 2025

A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.

Python 11,988 1,372 Updated Aug 20, 2024

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 71,685 8,488 Updated Oct 23, 2025

📚 Benchmark your browser agent on ~2.5k READ and ACTION based tasks

69 4 Updated Jul 29, 2025

The official GitHub repository for the paper "GA: A Comprehensive Survey on LLM-based GUI Agent"

3 Updated Aug 28, 2024
Python 4 Updated Jul 31, 2025

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,727 150 Updated Oct 9, 2025
Python 76 11 Updated Oct 13, 2025

Qianfan-VL: Domain-Enhanced Universal Vision-Language Models

163 11 Updated Sep 22, 2025

Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"

Python 936 104 Updated Mar 4, 2024

Mobile-Agent: The Powerful GUI Agent Family

Python 6,104 608 Updated Oct 17, 2025

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

JavaScript 1,055 101 Updated Dec 9, 2024

Prompt, run, edit, and deploy full-stack web applications. -- bolt.new -- Help Center: https://support.bolt.new/ -- Community Support: https://discord.com/invite/stackblitz

TypeScript 15,853 14,221 Updated Dec 17, 2024
Python 27 2 Updated Aug 31, 2025

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 12,486 1,192 Updated Oct 12, 2025

The model, data and code for the visual GUI Agent SeeClick

HTML 433 22 Updated Jul 13, 2025

ScreenQA dataset was introduced in the "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots" paper. It contains ~86K question-answer pairs collected by human annotators for ~35K…

Python 129 9 Updated Feb 7, 2025

This repository hosts a collection of datasets for training and evaluating CUA / GUI agents.

72 5 Updated Jul 27, 2025

UI-Venus is a native UI agent designed to perform precise GUI element grounding and effective navigation using only screenshots as input.

Python 492 36 Updated Aug 25, 2025

[ICCV 2025] GUIOdyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUIOdyssey consists of 8,834 episodes from 6 mobile devices, spanning 6 types of cross-app…

Python 130 8 Updated Aug 4, 2025

The code and data of We-Math 2.0.

Python 159 8 Updated Aug 30, 2025

Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models

Python 180 7 Updated Nov 4, 2024

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 7,895 518 Updated Oct 22, 2025

Automated generation of planar geometry olympiad problems

C# 98 21 Updated Aug 11, 2023