Starred repositories
local version for OmniAICreator/Anime-Llasa-3B-Captions-Demo
[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
A Deep Learning Approach for Password Guessing (https://arxiv.org/abs/1709.00440)
SOTAMak1r / Infinite-Forcing
Forked from guandeh17/Self-Forcing
Infinite-Forcing: Towards Infinite-Long Video Generation
Kandinsky 5.0: A family of diffusion models for Video & Image generation
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
High-Quality Text-to-Video Generation with Alpha Channel
Lynx: Towards High-Fidelity Personalized Video Generation
EasyLlasa is a TSTS (TextSpeechToSpeech) tool that generates Japanese speech from 5 to 15 seconds of Japanese audio and Japanese text.
to server only
Enable true multi-GPU capability in ComfyUI using xDiT, xFuser, and FSDP
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models
ComfyUI nodes for WanAnimate model input preprocessing
Cloud replacement for vacuum robots enabling local-only operation
MiMo-Audio: Audio Language Models are Few-Shot Learners
Unofficial WIP LoRA fine-tuning repository for VibeVoice
VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)
Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.
Long-form streaming TTS system for multi-speaker dialogue generation
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning