-
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
MLIR MIT License Updated Oct 23, 2025 -
LightLLM Public
Forked from ModelTC/LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Python Apache License 2.0 Updated Oct 22, 2025 -
generative-ai-for-beginners Public
Forked from microsoft/generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI
Jupyter Notebook MIT License Updated Oct 13, 2025 -
TIS Public
Forked from triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Python BSD 3-Clause "New" or "Revised" License Updated Oct 8, 2025 -
ml-systems-papers Public
Forked from byungsoo-oh/ml-systems-papers
Curated collection of papers in machine learning systems
Updated Oct 4, 2025 -
FlagGems Public
Forked from FlagOpen/FlagGems
FlagGems is an operator library for large language models implemented in the Triton Language.
Python Apache License 2.0 Updated Sep 17, 2025 -
llm_note Public
Forked from harleyszhang/llm_note
LLM notes, including model inference, transformer model structure, and LLM framework code analysis notes.
Python Updated Sep 16, 2025 -
Triton-Puzzles Public
Forked from srush/Triton-Puzzles
Puzzles for learning Triton
Jupyter Notebook Apache License 2.0 Updated Sep 11, 2025 -
leetgpu-challenges Public
Forked from dsl-learn/leetgpu-challenges
LeetGPU Challenges
-
DAX Public
Forked from RiseAI-Sys/DAX
High performance inference engine for diffusion models
Python Apache License 2.0 Updated Sep 5, 2025 -
LeetCUDA Public
Forked from xlite-dev/LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Cuda GNU General Public License v3.0 Updated Sep 3, 2025 -
nano-vllm Public
Forked from GeeeekExplorer/nano-vllm
Nano vLLM
Python MIT License Updated Aug 31, 2025 -
attention-gym Public
Forked from RiseAI-Sys/attention-gym
Triton-based sparse quantization attention kernel collection
Python Apache License 2.0 Updated Aug 29, 2025 -
hello-algo Public
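Forked from krahets/hello-algo
"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; English version in translation.
Java Other Updated Aug 27, 2025 -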
ParaVAE Public
Forked from RiseAI-Sys/ParaVAE
Distributed parallel 3D-Causal-VAE for efficient training and inference
Python Apache License 2.0 Updated Aug 20, 2025 -
self-llm Public
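Forked from datawhalechina/self-llm
"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment.
Jupyter Notebook Apache License 2.0 Updated Aug 13, 2025 -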
lite_llama Public
Forked from harleyszhang/lite_llama
A light llama-like LLM inference framework based on the Triton kernel.
Python Updated Aug 7, 2025 -
learning-in-opencamp Public template
Forked from opencamp-cn/learning-in-opencamp
General-purpose learning tool for OpenCamp training camps
Rust MIT License Updated Aug 6, 2025 -
Learning-CUDA Public
Forked from InfiniTensor/Learning-CUDA
CUDA track project for the 2025 summer training camp
Makefile Updated Jul 31, 2025 -
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Jul 29, 2025 -
sglang Public
Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Python Apache License 2.0 Updated Jul 29, 2025 -
TensorRT-LLM Public
Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
C++ Apache License 2.0 Updated Jul 29, 2025 -