Starred repositories
Benchmarking Deep Learning operations on different hardware
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
A collection of benchmarks to measure basic GPU capabilities
Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Large Language Models.
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
A PyTorch native platform for training generative AI models
A hybrid GPU cluster simulator for ML system performance estimation
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
A Git-compatible VCS that is both simple and powerful
cloc counts blank lines, comment lines, and physical lines of source code in many programming languages.
AgentScope: Agent-Oriented Programming for Building LLM Applications
CapaBench: A Game-Theoretic Evaluation Benchmark for Modular Attribution in LLM Agents
Qianfan-VL: Domain-Enhanced Universal Vision-Language Models
Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git)
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
A machine learning compiler for GPUs, CPUs, and ML accelerators