Stars
lihaoyun6 / FlashVSR_plus
Forked from OpenImagingLab/FlashVSR
Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional de…
A beginner-friendly interface to finetune & run inference on open TTS models 🗣️
Krea Realtime 14B. An open-source realtime AI video model.
Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation (ICCV 2025)
Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional de…
Revolutionary new technology that turns any image into Obama
Custom ComfyUI nodes for film grain, color matching, and video enhancement.
VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)
A tool to prepare datasets for training VibeVoice
ComfyUI nodes to use segment-anything-2
ComfyUI nodes for WanAnimate model input preprocessing
PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind
Unofficial WIP LoRA finetuning repository for VibeVoice
JimmyMa99 / train-higgs-audio
Forked from boson-ai/higgs-audio
Text-audio foundation model from Boson AI
For people who want to contribute to my site
ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation
Fast and local neural text-to-speech engine
Long-form streaming TTS system for multi-speaker dialogue generation
Powerful & Easy-to-Use Video Face Swapping and Editing Software
HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation.