Stars
pepsi7959 / ui-tars-desktop
Forked from bytedance/UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
An open-source OCR model for Thai, optimized for Thai document extraction.
🚀 Beautiful, fast and modern React UI library. (Previously NextUI)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Full System Prompt Transparency for All: a collection that aggregates full system prompts, guidelines, and tools from major AI models like ChatGPT, Gemini, Claude, Mistral, Anthropic, xAI, Perplexity, and more. …
One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Building a LINE chatbot that lets users chat with PDF, image, video, and audio files using Gemini and Cloud Functions for Firebase
Ghidra is a software reverse engineering (SRE) framework
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
Kaldi recipe for training on the Common Voice corpus in Thai
Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
A node-based image processing GUI aimed at making chaining image processing tasks easy and customizable. Born as an AI upscaling application, chaiNNer has grown into an extremely flexible and power…
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8…
The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
Testsigma is an agentic test automation platform powered by AI-coworkers that work alongside QA teams to simplify testing, accelerate releases and improve quality across web, mobile, desktop, API, …
Open-source face mask detection model and data. Detects faces and determines whether people are wearing masks.
DBFace is a real-time, single-stage detector for face detection, with faster speed and higher accuracy
The collection of pre-trained, state-of-the-art AI models for ailia SDK
Open Source smart glasses designed to be 1. All day wearable 2. Immediately useful 3. Extendable for makers, startups, and everyone else.
Recognize captchas using a deep learning ResNet model and TFLearn (with 1,000 training samples and just a few minutes of training, accuracy can reach 99%)