SGLang | AMD | Tsinghua University
- California, USA
- https://yushengsu-thu.github.io/
- @thu_yushengsu
Stars
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Claude Opus 4.6 wrote a dependency-free C compiler in Rust, with backends targeting x86 (64- and 32-bit), ARM, and RISC-V, capable of compiling a booting Linux kernel.
Tutorials for Triton, a language for writing GPU kernels
HuggingFace conversion and training library for Megatron-based models
Training library for Megatron-based models with bidirectional Hugging Face conversion capability
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Getting Started with Triton: A Tutorial for Python Beginners
A simple, performant and scalable Jax LLM!
The absolute trainer to light up AI agents.
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation feedback, cross-platform NVIDIA/AMD, Kernelbook + KernelBench
Open-source coding LLM for software engineering tasks
yushengsu-thu / sglang
Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM training.
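⛽️ "Algorithm Clearance Handbook": a from-scratch tutorial on algorithms and data structures, with 200 popular algorithm interview problems and 1000+ LeetCode problem walkthroughs, continuously updated!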
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
A collection of 500+ real-world ML & LLM system design case studies from 100+ companies. Learn how top tech firms implement GenAI in production.
A Next-Generation Training Engine Built for Ultra-Large MoE Models
I recently interviewed with some AI labs and these are the notes I took during my study for ML fundamentals and Design. This was in Mar 2025 and given how fast the field of AI moves, some of it may…
Research prototype of PRISM — a cost-efficient multi-LLM serving system with flexible time- and space-based GPU sharing.