A safetensors extension to efficiently store sparse quantized tensors on disk
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
A Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM