Reinforcement Learning via Self-Distillation (SDPO)
2026 Gemini 3 Competition Submission
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
⏰ AI conference deadline countdowns
Hackable and optimized Transformers building blocks, supporting composable construction.
Official Implementation of the ARMOR pruning algorithm [ICLR 2026]
Official JAX implementation of End-to-End Test-Time Training for Long Context
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense f…
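The lookup described above can be sketched in a few lines: a query is scored against a large trainable key table, but only the top-k matching value rows are read, so per-token compute stays roughly constant even as the number of memory slots (extra parameters) grows. This is a minimal NumPy sketch under assumed names and shapes, not the repo's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 1024 memory slots, 32-dim keys, 64-dim values, top-4 reads.
num_slots, d_key, d_value, k = 1024, 32, 64, 4
keys = rng.standard_normal((num_slots, d_key))      # trainable key table
values = rng.standard_normal((num_slots, d_value))  # trainable value table

def memory_lookup(query):
    """Sparse read: score all keys, but gather only the k best value rows."""
    scores = keys @ query                        # (num_slots,) similarity scores
    topk = np.argpartition(scores, -k)[-k:]      # indices of the k highest scores
    weights = np.exp(scores[topk] - scores[topk].max())
    weights /= weights.sum()                     # softmax over the selected slots only
    return weights @ values[topk]                # (d_value,) weighted sparse read

out = memory_lookup(rng.standard_normal(d_key))
print(out.shape)  # (64,)
```

Because the gather touches only k of the num_slots value rows, the parameter count scales with the table size while the per-query FLOPs scale with k, which is the sparsity axis the paper's title refers to.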
Accessible large language models via k-bit quantization for PyTorch.
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Mixture-of-Basis-Experts for Compressing MoE-based LLMs
Gemma open-weight LLM library, from Google DeepMind
An open-source guide that demystifies how U.S. universities evaluate and admit students into Computer Science PhD programs.
[ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Post-training with Tinker
FNSPID: A Comprehensive Financial News Dataset in Time Series
A curated list of high-quality papers on resource-efficient LLMs 🌱
Fully local web research and report writing assistant
[COLM 2025] Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources
Pretraining and inference code for a large-scale depth-recurrent language model