This repository documents my GPU learning journey. It gathers the experiments, kernels, and reimplementations I work on as I explore GPU programming more broadly. I will update it progressively as I learn.
- gpu-puzzles/ - implementations of the GPU Puzzles in CUDA C++ (https://github.com/srush/GPU-Puzzles)
- flash-attention/ - FlashAttention v1 reimplementation in CUDA + a naive GPU baseline for comparison