

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ 1,481 582 Updated Feb 15, 2025

Tampermonkey userscript: Zhihu backup and clipping. Saves your favorite answers, articles, and thoughts as markdown / zip / png

TypeScript 115 19 Updated Oct 8, 2025
Python 537 80 Updated Nov 3, 2025

Comprehending Mathematics (悟数学)

TeX 194 42 Updated Jul 27, 2025

Puzzles for learning Triton

Jupyter Notebook 2,098 171 Updated Nov 18, 2024

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 8,318 823 Updated Oct 17, 2025

[SIGGRAPH 2025] Official code of the paper "Cobra: Efficient Line Art COlorization with BRoAder References".

Python 225 16 Updated Apr 17, 2025

sketch + style = paints 🎨 (TOG2018/SIGGRAPH2018ASIA)

JavaScript 18,213 2,094 Updated Aug 1, 2023

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,215 420 Updated Nov 5, 2025

Supercharge Your LLM with the Fastest KV Cache Layer

Python 5,901 690 Updated Nov 5, 2025

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 910 44 Updated Oct 29, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,042 1,839 Updated Nov 5, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 9,980 1,664 Updated Nov 5, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,230 615 Updated Nov 5, 2025

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 10,705 1,092 Updated Apr 30, 2025

A massively parallel, high-level programming language

Rust 19,070 466 Updated Jun 3, 2025

Distribute and run LLMs with a single file.

C 23,324 1,234 Updated Nov 5, 2025

NCCL Profiling Kit

Python 145 11 Updated Jul 1, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 40,608 4,613 Updated Nov 5, 2025

Ongoing research training transformer models at scale

Python 14,088 3,244 Updated Nov 5, 2025

A latent text-to-image diffusion model

Jupyter Notebook 71,754 10,517 Updated Jun 18, 2024

Xray, Tuic, hysteria2, sing-box eight-in-one one-click script

Shell 17,538 5,149 Updated Nov 3, 2025

🏂🏻 Handbook for programmers on working overseas and English-language interviews

4,767 345 Updated Feb 25, 2024

A language about virtual kontinuation

Mathematica 26 Updated Dec 14, 2024

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Python 12,789 3,690 Updated Nov 5, 2025

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 33,894 3,230 Updated Nov 5, 2025

System for AI Education Resource.

Python 4,159 522 Updated Oct 25, 2024

Jupyter kernel for the C++ programming language

C++ 3,242 316 Updated Oct 27, 2025

A collection of compiler learning resources.

Python 2,576 358 Updated Mar 19, 2025