-
chenxwh.github.io Public
Forked from alshedivat/al-folio
A beautiful, simple, clean, and responsive Jekyll theme for academics
-
fairseq2 Public
Forked from facebookresearch/fairseq2
FAIR Sequence Modeling Toolkit 2
Python MIT License Updated Jul 22, 2025
-
OminiControl Public
Forked from Yuanshi9815/OminiControl
A minimal and universal controller for FLUX.1.
-
DeepSeek-VL2 Public
Forked from deepseek-ai/DeepSeek-VL2
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
-
NOVA Public
Forked from baaivision/NOVA
NOVA: Autoregressive Video Generation without Vector Quantization
-
CosyVoice Public
Forked from FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
-
echomimic Public
Forked from antgroup/echomimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Python Apache License 2.0 Updated Dec 10, 2024
-
Florence-VL Public
Forked from JiuhaiChen/CVPR2025-Florence-VL
-
Sana Public
Forked from NVlabs/Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
-
LTX-Video Public
Forked from Lightricks/LTX-Video
Official repository for LTX-Video
Python Other Updated Nov 24, 2024
-
OmniParser Public
Forked from microsoft/OmniParser
A simple screen parsing tool towards pure vision based GUI agent
-
hart Public
Forked from mit-han-lab/hart
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Python MIT License Updated Oct 19, 2024
-
CogView3 Public
Forked from zai-org/CogView4
Text-to-image generation: CogView3-Plus and CogView3 (ECCV 2024)
Python Apache License 2.0 Updated Oct 14, 2024
-
ml-depth-pro Public
Forked from apple/ml-depth-pro
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
-
Lotus Public
Forked from EnVision-Research/Lotus
Official Implementation of Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
-
DepthCrafter Public
Forked from Tencent/DepthCrafter
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Python Other Updated Oct 1, 2024
-
CogVLM2 Public
Forked from zai-org/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
-
CogVideo Public
Forked from zai-org/CogVideo
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
-
DiffSynth-Studio Public
Forked from modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
-
Depth-Anything-V2 Public
Forked from DepthAnything/Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
-
Omost Public
Forked from lllyasviel/Omost
Your image is almost there!
-
SadTalker Public
Forked from OpenTalker/SadTalker
(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
-
OpenVoice Public
Forked from myshell-ai/OpenVoice
Instant voice cloning by MyShell.
-
PixArt-sigma Public
Forked from PixArt-alpha/PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
-
Kandinsky-2 Public
Forked from ai-forever/Kandinsky-2
Kandinsky 2: multilingual text2image latent diffusion model
-
AniPortrait Public
Forked from Zejun-Yang/AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
-
video-retalking Public
Forked from OpenTalker/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
-
MeloTTS Public
Forked from myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
-
SUPIR Public
Forked from Fanghua-Yu/SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild