Lists (26)
Books
CUDA
DND
Edge
Edge AI
Elixir
Index / Survey / Whatever
Inference
Inference Engines
JAX
Malachite
ML Compilers
MLSys
Netherdeep
Network Science
Performance Optimization
Profiling
Quantization
Recommender Systems
Reinforcement Learning
Research
Stuff
Tools
Triton
Video Diffusion
Visualization
Starred repositories
A minimal GPU design in Verilog to learn how GPUs work from the ground up
Fast and accurate automatic speech recognition (ASR) for edge devices
Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory
Repository hosting code to reproduce our paper (with Stanford and TogetherAI), "Making Databases Faster with LLM Evolutionary Sampling"
FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.
What LLMs Think When You Don’t Tell Them What to Think About?
This repository contains companion software for the Colfax Research paper "Categorical Foundations for CuTe Layouts".
Collection of high-quality robo learning papers for bipedal robots.
GPU matmul kernel exploration: Triton, cuTile, and TileIR backend benchmarks on Blackwell
TRUST is a thermohydraulic software package for CFD simulations. It was originally designed for incompressible single-phase and Low Mach Number flows, but now also allows simulating real compressib…
A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)
🔥 Clone and recreate any website as a modern React app in seconds
Official Repository of VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents
🌎 Regional-to-global coupled ocean and sea ice simulations based on Oceananigans
Google TPU optimizations for transformers models
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)
FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels
Review automated kernel generation in the era of LLMs
Our first fully AI-generated deep learning system
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
Algorithm powering the For You feed on X