Stars
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
MAGI-1: Autoregressive Video Generation at Scale
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
[ICLR 2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation
A large-scale information-rich web dataset, featuring millions of real clicked query-document labels
Scalable and robust tree-based speculative decoding algorithm
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
SkyAGI: Emerging human-behavior simulation capability in LLMs
Instruct-tune LLaMA on consumer hardware
Code and documentation to train Stanford's Alpaca models and generate the data.
Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
DSPy: The framework for programming, rather than prompting, language models
Running large language models on a single GPU for throughput-oriented scenarios.
Exploring finetuning public checkpoints on filtered 8K sequences from the Pile
A framework for few-shot evaluation of language models.
User-friendly secure computation engine based on secure multi-party computation
Python package built to ease deep learning on graphs, on top of existing DL frameworks.
Reformer, the efficient Transformer, in PyTorch
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
PyTorch implementation of the Image Transformer for unconditional image generation
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Implementations of several fast approximate algorithms for geometric optimal transport (OT)