- Fudan University
- https://chenhsing.github.io/
Stars
We present StableAvatar, the first end-to-end video diffusion transformer, which synthesizes infinite-length high-quality audio-driven avatar videos without any post-processing, conditioned on a re…
4-step distilled version of Wan2.2-TI2V-5B
Lumos Project: Frontier video unified model research by Alibaba DAMO Academy, including Lumos-1, etc.
Official repository for ReasonGen-R1
ComfyUI nodes for StableAnimator
📹 A more flexible framework that can generate videos at any resolution and create videos from images.
MAGI-1: Autoregressive Video Generation at Scale
Let's make video diffusion practical!
PyTorch implementation for the paper "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
A comprehensive list of work investigating physical cognition in video generation, including papers, code, and related websites.
[ICCV 2025] CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation
[ICCV 2025] MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Wan: Open and Advanced Large-Scale Video Generative Models
[CVPR 2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…
⏰ Collaboratively track worldwide conference deadlines (website, Python CLI, WeChat applet) / If you find it useful, please star this project, thanks~
Awesome diffusion Video-to-Video (V2V): a collection of papers on diffusion model-based video editing, a.k.a. video-to-video (V2V) translation, plus a video editing benchmark codebase.
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model (ICLR 2025 Oral)
Refine high-quality datasets and visual AI models
[ICCV 2025] MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
Official code of SmartEdit [CVPR-2024 Highlight]
Generative Models by Stability AI
Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction