Stars
ccint - a C/C++ interpreter, built on top of the Clang and LLVM compiler infrastructure.
A novel, highly optimized CUDA implementation of the k-means algorithm.
Cache library and distributed caching server. Memcached compatible.
This project aims to collect the latest "call for reviewers" links from various top CS/ML/AI conferences/journals.
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
Train transformer language models with reinforcement learning.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Fully open data curation for reasoning models
Provides a practical interactive interface for LLMs such as GPT/GLM, with special optimizations for the paper reading/polishing/writing experience. Modular design with support for custom shortcut buttons & function plugins, code analysis & self-translation for Python and C++ projects, PDF/LaTeX paper translation & summarization, parallel queries across multiple LLM models, and local models such as chatglm3. Integrates 通义千问, deepseekcoder, 讯飞星火, 文心一言, llama2, rwkv, claude2, m…
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
DeepEP: an efficient expert-parallel communication library
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.
FlashMLA: Efficient Multi-head Latent Attention Kernels
Efficient Deep Learning Systems course materials (HSE, YSDA)
This is a list of useful libraries and resources for CUDA development.
A self-paced tutorial for CUDA high-performance programming.
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
Building blocks for foundation models.
CUDA Templates and Python DSLs for High-Performance Linear Algebra
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
LLVM (Low Level Virtual Machine) Guide. Learn all about the compiler infrastructure, which is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs. Originally im…