Kaweees

Miguel Villa Floran Kaweees

Curious, Creative, and Clever. CPE @calpoly, with a focus on embedded systems and autonomous robotics.

231 followers · 360 following

San Francisco, CA
https://miguelvf.com
in/miguel-vf
@kaweees

Highlights

Developer Program Member
Pro

Organizations

Stars

35 stars written in Cuda

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 28,968 3,404 Updated Jun 26, 2025

NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,285 2,054 Updated Feb 2, 2026

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 9,000 1,108 Updated Feb 9, 2026

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,197 825 Updated Feb 25, 2026

Infatoshi / cuda-course

Cuda 3,317 589 Updated Feb 7, 2026

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda 3,193 246 Updated Feb 24, 2026

computerhistory / AlexNet-Source-Code

This package contains the original 2012 AlexNet code.

Cuda 2,834 366 Updated Mar 12, 2025

BBuf / how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Cuda 2,825 257 Updated Feb 15, 2026

Tony-Tan / CUDA_Freshman

Cuda 2,695 505 Updated Jan 16, 2024

brucefan1983 / CUDA-Programming

Sample codes for my CUDA programming book

Cuda 2,010 383 Updated Dec 14, 2025

tspeterkim / flash-attention-minimal

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 1,084 110 Updated Dec 30, 2024

siboehm / SGEMM_CUDA

Fast CUDA matrix multiplication from scratch

Cuda 1,065 162 Updated Sep 2, 2025

olcf / cuda-training-series

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 942 349 Updated Aug 19, 2024

CoffeeBeforeArch / cuda_programming

Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch

Cuda 942 179 Updated Jul 19, 2023

Celebrandil / CudaSift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)

Cuda 935 298 Updated Oct 1, 2025

NVIDIA / multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 869 147 Updated Sep 26, 2025

NVIDIA / cuopt

GPU accelerated decision optimization

Cuda 721 127 Updated Feb 27, 2026

clu0 / unet.cu

UNet diffusion model in pure CUDA

Cuda 657 31 Updated Jun 28, 2024

CisMine / Parallel-Computing-Cuda-C

CUDA Learning guide

Cuda 531 63 Updated Jun 20, 2024

Infatoshi / mnist-cuda

Cuda 454 79 Updated Dec 18, 2025

Maharshi-Pandya / cudacodes

Learnings and programs related to CUDA

Cuda 433 20 Updated Jun 29, 2025

wangzyon / NVIDIA_SGEMM_PRACTICE

Step-by-step optimization of CUDA SGEMM

Cuda 433 57 Updated Mar 30, 2022

d0rc / egg.c

EGGROLL in C, integer-first training

Cuda 343 31 Updated Dec 22, 2025

MarvinChung / Orbeez-SLAM

Cuda 288 30 Updated Oct 9, 2023

leimao / CUDA-GEMM-Optimization

CUDA Matrix Multiplication Optimization

Cuda 261 24 Updated Jul 19, 2024

NVlabs / parrot

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without un…

Cuda 248 15 Updated Jan 29, 2026

R100001 / Programming-Massively-Parallel-Processors

Cuda 217 43 Updated Aug 2, 2024

qdLMF / LIO-SAM-GPU-ScanToMapOpt

A CUDA reimplementation of the line/plane odometry of LIO-SAM. A point cloud hash map (inspired by iVox of Faster-LIO) on GPU is used to accelerate 5-neighbour KNN search. Run on Jetson Orin NX 8GB.

Cuda 181 23 Updated Aug 24, 2025

drkennetz / cuda_examples

Some CUDA example code with READMEs.

Cuda 179 27 Updated Nov 11, 2025

salykova / sgemm.cu

High-Performance FP32 GEMM on CUDA devices

Cuda 117 8 Updated Jan 21, 2025

Lists (16)

AI Stuff

Business Card 💳

career

⭐ Dotfiles

dwm

💻 Embedded Systems

Emulator

Micromouse

🚀 My stack

💾 PCB Projects

🌐 Portfolio Inspiration

🤖 Robotics

suckless

Svelte

Tauri

Verilog HDL
