Stars
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
An interconnect topology detection tool for Azure VMs
A General-purpose Task-parallel Programming System using Modern C++
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Thin, unified, C++-flavored wrappers for the CUDA APIs
A validation and profiling tool for AI infrastructure
Smart pointers for the (GNU) C programming language
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
Explains complex systems using visuals and simple terms. Helps you prepare for system design interviews.
A dynamic array implementation in C similar to the one found in standard C++
cloc counts blank lines, comment lines, and physical lines of source code in many programming languages.
Easy to use, modular, header only, macro based, generic and type-safe Data Structures in C
M*LIB is a library of generic and type-safe containers / data structures in pure C (C99 / C11), covering a wide collection of containers comparable to the C++ STL.
LLVM (Low Level Virtual Machine) Guide. Learn all about the compiler infrastructure, which is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs. Originally im…
GPGPU processor supporting the RISC-V V (vector) extension, developed with Chisel HDL
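Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.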