Stars
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
A real-time human motion tracking and analysis system optimized for Apple Silicon (M4), designed for precise posture correction, fitness training, dance coaching, and interactive body-based applica…
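PyTorch implementation of semantic segmentation for high-resolution remote sensing imagery (land-cover classification)
A Chinese financial trading framework based on multi-agent LLMs - TradingAgents Chinese-enhanced edition
A simple technical explainer tutorial project focused on interesting, cutting-edge technical concepts and principles. Each article aims to be readable within 5 minutes.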
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Solve Visual Understanding with Reinforced VLMs
Fully open reproduction of DeepSeek-R1
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
OCR, layout analysis, reading order, table recognition in 90+ languages
[Gap-Tree-Sort Algorithm] Performs layout analysis on OCR results or PDF-extracted text and sorts it in human reading order.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Image composition toolbox: everything you want to know about image composition or object insertion
⚡️HivisionIDPhotos: a lightweight and efficient AI tool and algorithm for making ID photos.
real time face swap and one-click video deepfake with only a single image
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
👾 Fast and simple video download library and CLI tool written in Go
Downloads videos and playlists from YouTube
🚀 Curated collection of amazing Python scripts, from basic to advanced, including task automation scripts.
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
CLUENER2020: Chinese Fine-Grained Named Entity Recognition
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.