🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…

3,370 344 Updated Jul 25, 2025

jihoo-kim / awesome-RecSys

A curated list of awesome Recommender System resources (Books, Conferences, Researchers, Papers, GitHub Repositories, Useful Sites, YouTube Videos)

1,402 208 Updated Feb 13, 2022

mlip-cmu / book

Sources for the book "Machine Learning in Production"

CSS 136 37 Updated Jul 19, 2025

mirage-project / mirage

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,933 148 Updated Nov 4, 2025

serverless / serverless

⚡ Serverless Framework – Effortlessly build apps that auto-scale, incur zero costs when idle, and require minimal maintenance using AWS Lambda and other managed cloud services.

JavaScript 46,890 5,744 Updated Oct 22, 2025

MuLabPKU / TransMLA

TransMLA: Multi-Head Latent Attention Is All You Need (NeurIPS 2025 Spotlight)

Python 396 22 Updated Sep 23, 2025

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 28,043 3,259 Updated Jun 26, 2025

KellerJordan / modded-nanogpt

NanoGPT (124M) in 3 minutes

Python 3,764 488 Updated Oct 29, 2025

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 48,902 8,185 Updated Dec 9, 2024

stanford-cs336 / spring2025-lectures

Python 1,982 421 Updated Oct 28, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 8,142 1,004 Updated Nov 3, 2025

skyzh / tiny-llm

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 3,383 227 Updated Nov 2, 2025

stas00 / ml-engineering

Machine Learning Engineering Open Book

Python 15,597 956 Updated Oct 27, 2025

llm-d / llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 1,968 217 Updated Nov 4, 2025

eyaltoledano / claude-task-master

An AI-powered task-management system you can drop into Cursor, Lovable, Windsurf, Roo, and others.

JavaScript 23,454 2,285 Updated Nov 4, 2025

PatrickJS / awesome-cursorrules

📄 Configuration files that enhance Cursor AI editor experience with custom rules and behaviors

MDX 35,118 2,979 Updated Oct 24, 2025

Inference-Engine-Arena / inference-engine-arena

Postman & Chatbot Arena for inference benchmarking.

Python 14 Updated Jun 19, 2025

deepseek-ai / profile-data

Analyze computation-communication overlap in V3/R1.

1,112 143 Updated Mar 21, 2025

lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,211 4,766 Updated Jun 2, 2025

NexaAI / nexa-sdk

Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and more.

Go 5,592 721 Updated Nov 4, 2025

firstbatchxyz / dria-sdk

Dria SDK is for building and executing synthetic data generation pipelines on Dria Knowledge Network.

Python 29 7 Updated Apr 3, 2025

Mobile-Artificial-Intelligence / maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

Dart 2,187 224 Updated Jul 28, 2025

ggml-org / llama.cpp

LLM inference in C/C++

C++ 88,723 13,513 Updated Nov 4, 2025

janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 39,113 2,361 Updated Nov 4, 2025

mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,409 177 Updated Feb 24, 2025

mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,331 277 Updated Jul 17, 2025

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,031 1,834 Updated Nov 4, 2025

zai-org / GLM-Edge

GLM Series Edge Models

Python 152 14 Updated Jun 12, 2025

Yue Yang yynj26
