Organizations

@alpa-projects @vllm-project


TPU inference for vLLM, with unified JAX and PyTorch support.

Python 131 19 Updated Oct 28, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,113 149 Updated Oct 28, 2025

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 791 58 Updated Oct 20, 2025

This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

HCL 23 42 Updated Oct 28, 2025

🤖 A WeChat bot built on WeChaty and combined with AI services such as DeepSeek / ChatGPT / Kimi / iFlytek. It can help you automatically reply to WeChat messages, manage WeChat groups and friends, detect zombie followers, and so on...

JavaScript 8,897 1,063 Updated Oct 24, 2025

Renderer for the harmony response format to be used with gpt-oss

Rust 3,937 220 Updated Aug 15, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 18,981 1,884 Updated Oct 23, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,911 142 Updated Oct 28, 2025

From Images to High-Fidelity 3D Assets with Production-Ready PBR Material

Python 2,323 306 Updated Oct 17, 2025

kernels, of the mega variety

Python 591 26 Updated Sep 28, 2025

A PyTorch native platform for training generative AI models

Python 4,606 578 Updated Oct 28, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 3,747 282 Updated Oct 28, 2025

A program to read, merge, and write programs for the Breville Control °Freak®

Java 25 2 Updated Aug 19, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 94,329 25,678 Updated Oct 28, 2025

The Startup CTO's Handbook, a book covering leadership, management and technical topics for leaders of software engineering teams

13,863 773 Updated Jul 30, 2025
Python 179 6 Updated Aug 4, 2025

Load compute kernels from the Hub

Python 309 25 Updated Oct 27, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,422 955 Updated Oct 24, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,837 726 Updated Oct 15, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,928 285 Updated May 15, 2025

common in-memory tensor structure

C++ 1,088 154 Updated Oct 11, 2025

The best OSS video generation models, created by Genmo

Python 3,474 444 Updated Sep 5, 2025

Manipulating Python Programs

Python 694 31 Updated Oct 22, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,157 269 Updated Oct 27, 2025

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 909 44 Updated Oct 22, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 432 33 Updated May 30, 2025

A framework for few-shot evaluation of language models.

Python 10,465 2,808 Updated Oct 27, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,156 82 Updated Aug 28, 2025

Blazingly fast LLM inference.

Rust 6,174 464 Updated Oct 26, 2025