| Algorithms | Variants |
|---|---|
| Random | bernoulli normal uniform |
| Quantization | symmetric per-block per-tensor q2 q4 q8 fp4 |
| Reduction | mean sum prod max min arg[max\|min] per-cube per-plane |
| Matmul | mma unit tma multi-stage specialization ordered multi-rows |
| Convolution | mma unit tma multi-stage im2col |
| Attention | mma unit multi-rows |
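
To make the "symmetric", "per-block" and "per-tensor" entries in the Quantization row concrete, here is a minimal CPU-side sketch in plain Rust. It is not CubeK's API or implementation: the function names `quantize_symmetric_per_block` and `dequantize_per_block` are hypothetical, and the 8-bit range of ±127 is an assumption chosen for illustration. Each block gets one scale derived from its largest absolute value, and no zero-point offset is used.

```rust
/// Illustrative only: symmetric 8-bit quantization with one scale per block.
/// Hypothetical helper, not part of CubeK or CubeCL.
fn quantize_symmetric_per_block(values: &[f32], block_size: usize) -> (Vec<i8>, Vec<f32>) {
    assert!(block_size > 0, "block_size must be non-zero");
    let mut quantized = Vec::with_capacity(values.len());
    let mut scales = Vec::with_capacity((values.len() + block_size - 1) / block_size);

    for block in values.chunks(block_size) {
        // Symmetric scheme: the scale maps the largest magnitude to 127.
        let max_abs = block.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        // An all-zero block would give a zero scale; fall back to 1.0.
        let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
        scales.push(scale);

        for &v in block {
            quantized.push((v / scale).round().clamp(-127.0, 127.0) as i8);
        }
    }
    (quantized, scales)
}

/// Illustrative only: reverse the mapping by multiplying by the block's scale.
fn dequantize_per_block(quantized: &[i8], scales: &[f32], block_size: usize) -> Vec<f32> {
    quantized
        .chunks(block_size)
        .zip(scales)
        .flat_map(|(block, &scale)| block.iter().map(move |&q| q as f32 * scale))
        .collect()
}
```

In this sketch, passing `block_size = values.len()` yields a single scale for the whole input, which corresponds to the per-tensor case.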

CubeK: high-performance multi-platform kernels in CubeCL

Repository: tracel-ai/cubek

Licenses: Apache-2.0 (`LICENSE-APACHE`), MIT (`LICENSE-MIT`)