yynj26
  • Bay area, CA

Algorithm Engineer: Summary of Machine Learning Interview Questions

1,608 216 Updated Sep 26, 2019

Simple, scalable AI model deployment on GPU clusters

Python 3,897 392 Updated Oct 23, 2025

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…

3,345 342 Updated Jul 25, 2025

A curated list of awesome Recommender System (Books, Conferences, Researchers, Papers, Github Repositories, Useful Sites, Youtube Videos)

1,400 208 Updated Feb 13, 2022

Sources for the book "Machine Learning in Production"

CSS 135 36 Updated Jul 19, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,906 141 Updated Oct 23, 2025

⚡ Serverless Framework – Effortlessly build apps that auto-scale, incur zero costs when idle, and require minimal maintenance using AWS Lambda and other managed cloud services.

JavaScript 46,884 5,745 Updated Oct 22, 2025

TransMLA: Multi-Head Latent Attention Is All You Need (NeurIPS 2025 Spotlight)

Python 391 22 Updated Sep 23, 2025

LLM training in simple, raw C/CUDA

Cuda 27,942 3,247 Updated Jun 26, 2025

NanoGPT (124M) in 3 minutes

Python 3,677 472 Updated Oct 16, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 47,665 7,994 Updated Dec 9, 2024

Nano vLLM

Python 7,184 921 Updated Aug 31, 2025

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 3,351 224 Updated Oct 12, 2025

Machine Learning Engineering Open Book

Python 15,489 941 Updated Oct 21, 2025

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 1,919 205 Updated Oct 23, 2025

An AI-powered task-management system you can drop into Cursor, Lovable, Windsurf, Roo, and others.

JavaScript 23,156 2,250 Updated Oct 23, 2025

📄 Configuration files that enhance Cursor AI editor experience with custom rules and behaviors

MDX 34,789 2,955 Updated Sep 24, 2025

Postman & Chatbot Arena for inference benchmarking.

Python 14 Updated Jun 19, 2025

Analyze computation-communication overlap in V3/R1.

1,110 143 Updated Mar 21, 2025

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,178 4,765 Updated Jun 2, 2025

Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and more.

Go 5,472 708 Updated Oct 23, 2025

Dria SDK is for building and executing synthetic data generation pipelines on Dria Knowledge Network.

Python 28 7 Updated Apr 3, 2025

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

Dart 2,164 223 Updated Jul 28, 2025

LLM inference in C/C++

C++ 88,226 13,416 Updated Oct 23, 2025

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 38,311 2,306 Updated Oct 23, 2025

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,404 177 Updated Feb 24, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,320 276 Updated Jul 17, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 11,940 1,818 Updated Oct 23, 2025

GLM Series Edge Models

Python 150 12 Updated Jun 12, 2025