Stars
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
Generalizable Perception Stack for all things 3D, 4D & Scene Understanding
[NeurIPS'25] Official repository of Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
MiniMax-M2, a Mini model built for Max coding & agentic workflows.
Official repo for ArtiLatent (SIGGRAPH Asia 2025)
Official Implementation of "UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation"
Official implementation for "DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion".
A linear estimator on top of CLIP to predict the aesthetic quality of pictures
A Python module to repair invalid JSON from LLMs
Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation"
RAG-Anything: All-in-One RAG Framework
NVIDIA Linux open GPU with P2P support
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Official implementation of the paper "HRAvatar: High-Quality and Relightable Gaussian Head Avatar" [CVPR 2025]
(NeurIPS 2025) Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation
Edit-R1: Reinforce Image Editing with Diffusion Negative-Aware Finetuning and MLLM Implicit Feedback
An extremely fast Python package and project manager, written in Rust.
torchcomms: a modern PyTorch communications API
Dexbotic: Open-Source Vision-Language-Action Toolbox
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation