- Nanjing University
- Nanjing, China
- https://z-jiaming.github.io/
Starred repositories
[ArXiv 25] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
VideoNSA: Native Sparse Attention Scales Video Understanding
Official Repo for Self-Forcing++ High Quality Long Video Generation
LongLive: Real-time Interactive Long Video Generation
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
Tracking the latest and greatest research papers on video generation.
GoatWu / Self-Forcing-Plus
Forked from guandeh17/Self-Forcing
Unofficial extension implementation of Self-Forcing to support I2V && 14B training.
4-step distilled version of Wan2.2-TI2V-5B
A collection of papers/projects that train flow matching models/policies via RL.
Pusa: Thousands Timesteps Video Diffusion Model
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A unified inference and post-training framework for accelerated video generation.
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
SkyReels-V2: Infinite-length Film Generative model
Use Claude Code or Cursor CLI on mobile and web with Claude Code UI. Claude Code UI is a free, open-source web UI/GUI that helps you manage your Claude Code sessions and projects remotely.
An open-source AI agent that brings the power of Gemini directly into your terminal.
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
Reference PyTorch implementation and models for DINOv3
Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation.
[Siggraph '23] NeRSemble: Neural Radiance Field Reconstruction of Human Heads