Stars
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
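CRS: a self-hosted Claude Code mirror and one-stop open-source relay service that unifies access to Claude, OpenAI, Gemini, and Droid subscriptions, supports shared ("carpool") use to split costs more efficiently, and works seamlessly with native tools.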
Elevate your AI research writing, no more tedious polishing ✨
[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
CLIP+MLP Aesthetic Score Predictor
Official repository for the paper "TIIF-Bench: How Does Your T2I Model Follow Your Instructions?".
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
GenEval: An object-focused framework for evaluating text-to-image alignment
[NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation
Model Compression Toolbox for Large Language Models and Diffusion Models
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
A list of papers, docs, and code about diffusion quantization. This repo collects various quantization methods for diffusion models. PRs are welcome for works (papers, repositories) that the repo has missed.
Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.
Open-Sora: Democratizing Efficient Video Production for All
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
A unified inference and post-training framework for accelerated video generation.
Enjoy the magic of Diffusion models!
Wan: Open and Advanced Large-Scale Video Generative Models
This is a Chinese translation of the CUDA programming guide
[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)
A one-of-a-kind resume builder that keeps your privacy in mind. Completely secure, customizable, portable, open-source and free forever. Try it out today!
[ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" and "SparseVLM+: Visual Token Sparsification with Improved Text-Vis…
🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× vs cuBLAS
Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers"
[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
A curated list of research papers, resources, and advancements on Diffusion Cache and related efficient diffusion model acceleration techniques.
"Paper2Slides: From Paper to Presentation in One Click"
Running VLA at 30Hz frame rate and 480Hz trajectory frequency