Stars
SGLang is a high-performance serving framework for large language models and multimodal models.
A high-throughput and memory-efficient inference and serving engine for LLMs
Community maintained hardware plugin for vLLM on Ascend
Visualizer for neural network, deep learning and machine learning models
A debugging and profiling tool that can trace and visualize Python code execution
GitHub mirror of the triton-lang/triton repo.
Development repository for the Triton language and compiler
verl: Volcano Engine Reinforcement Learning for LLMs
PaddlePaddle custom device implementation. (PaddlePaddle custom hardware integration)
Tile primitives for speedy kernels
Fast and memory-efficient exact attention
Virtual whiteboard for sketching hand-drawn like diagrams
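PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)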
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
PaddleFormers is an easy-to-use library and model zoo of pre-trained large language models, based on PaddlePaddle.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
The IBM/charts repository provides helm charts for IBM and Third Party middleware.
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
FlashInfer: Kernel Library for LLM Serving
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
🚀 Efficient implementations of state-of-the-art linear attention models
TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels
The new Windows Terminal and the original Windows console host, all in the same place!
Apache Spark - A unified analytics engine for large-scale data processing
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
Train transformer language models with reinforcement learning.
Generation of diagrams such as flowcharts or sequence diagrams from text, in a similar manner to Markdown