Stars
This project aims to build a scalable transactional stream processing engine on modern hardware. It allows ACID transactions to be run directly on streaming data. It shares similar project visio…
lzbench is an in-memory benchmark of open-source compressors
Zstandard - Fast real-time compression algorithm
New generation entropy codecs: Finite State Entropy and Huff0
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
A large collection of system log datasets for AI-driven log analytics [ISSRE'23]
Awesome LLM compression research papers and tools.
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
The official GitHub page for the survey paper "A Survey of Large Language Models".
The definitive Web UI for local AI, with powerful features and easy setup.
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
FlashInfer: Kernel Library for LLM Serving
A time-series database for high-performance real-time analytics packaged as a Postgres extension
SpotServe: Serving Generative Large Language Models on Preemptible Instances
Hackable and optimized Transformers building blocks, supporting a composable construction.
An easy-to-use PyTorch to TensorRT converter
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
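The last entry describes a device-agnostic launch-and-train helper in the style of Hugging Face Accelerate. As a rough illustration of that workflow, here is a minimal sketch assuming the `accelerate` package and a made-up synthetic regression task; it is not taken from that repository's documentation.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator  # assumed: Hugging Face Accelerate is the library described above

# Tiny synthetic regression task, purely to exercise the training loop.
dataset = TensorDataset(torch.randn(256, 16), torch.randn(256, 1))
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Accelerator() picks the available device; on suitable hardware you can pass
# mixed_precision="fp16" (or "bf16") to enable automatic mixed precision.
accelerator = Accelerator()
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for epoch in range(2):
    for inputs, targets in dataloader:
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(inputs), targets)
        accelerator.backward(loss)  # replaces loss.backward() so scaling and distributed sync are handled
        optimizer.step()
```

The same script can be run unchanged on CPU or a single GPU, or spread across processes with the `accelerate launch` CLI.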