Starred repositories
Benchmarking Deep Learning operations on different hardware
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
A collection of benchmarks to measure basic GPU capabilities
Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Large Language Models.
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
A PyTorch native platform for training generative AI models
A hybrid GPU cluster simulator for ML system performance estimation
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
A Git-compatible VCS that is both simple and powerful
cloc counts blank lines, comment lines, and physical lines of source code in many programming languages.
AgentScope: Agent-Oriented Programming for Building LLM Applications
CapaBench: A Game-Theoretic Evaluation Benchmark for Modular Attribution in LLM Agents
Qianfan-VL: Domain-Enhanced Universal Vision-Language Models
Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git)
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
A machine learning compiler for GPUs, CPUs, and ML accelerators