Stars
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance
ComfyUI utility nodes for Z-Image model. Features LLM-powered prompt enhancement using the official Z-Image system prompt.
Character Select Stand Alone App with AI prompt and ComfyUI/WebUI API support for wai-il model
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
An RWKV management and startup tool: fully automated, only 8MB, and it provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for c…
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Recall (AR), Spatio-Temporal Tube Average Precision (STT-AP). This project supports different bounding b…
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Wan: Open and Advanced Large-Scale Video Generative Models
A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…
A model-driven approach to building AI agents in just a few lines of code.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Amazon Bedrock Agentcore accelerates AI agents into production with the scale, reliability, and security critical to real-world deployment.
The world's first open-source multimodal creative assistant. A privacy-focused substitute for Canva and Manus that can be run locally.
[CVPR 2025] Learning Flow Fields in Attention for Controllable Person Image Generation
HandFixer: a one-click hand repair workflow for ComfyUI
A machine learning image inpainting task that removes watermarks from images, producing results indistinguishable from the ground truth image
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…
Bring projects, wikis, and teams together with AI. AppFlowy is the AI collaborative workspace where you achieve more without losing control of your data. The leading open source Notion alternative.
A Docker-free offline version of HeyGem; Python and Linux are all you need!