Stars
A modern download manager that supports all platforms. Built with Golang and Flutter.
ComfyUI LLM plugin compatible with the OpenAI API
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
The successful integration of the Qwen3-VL-Instruct series into the ComfyUI platform enables smooth operation, supporting (but not limited to) text-based queries, video queries, single-image quer…
A feature-rich command-line audio/video downloader
Downloads videos and playlists from YouTube
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching
Enhancements & experiments for ComfyUI, mostly focusing on UI features
Official front-end implementation of ComfyUI
A ComfyUI custom node for Google's Gemini 2.5 Flash Image (aka "Nano Banana") model - the state-of-the-art image generation and editing AI that went viral for its incredible quality and capabilities.
New Nano-Banana Google Gemini API for ComfyUI: generate images, transcribe audio, summarize videos. A separate implementation of my old IF_AI tools for easy installation.
Custom nodes for ComfyUI that utilize a language model to generate text-to-image prompts
LLM Agent Framework in ComfyUI includes MCP server, Omost, GPT-SoVITS, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, access to Feishu and Discord, and adapts to all LLMs with similar OpenAI / aisuite interfac…
ComfyUI nodes collection: better TAESD previews (including batch previews), improved HyperTile and Deep Shrink nodes
A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.
ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio
Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)
[ACM MM 2025] Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
DiffuEraser is a diffusion model for video inpainting; you can use it in ComfyUI.
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
SD-Trainer: LoRA & Dreambooth training scripts and GUI for diffusion models, built on kohya-ss's trainer.