A composite Posit format and Torch + CUDA kernels for accelerated mixed-precision arithmetic


zeroby0/composit


COMPOSIT

This repo contains CUDA kernels for accelerating ComPosit quantised arithmetic, and code to perform quantisation-aware training (QAT) on PyTorch models.
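The exact ComPosit layout is defined by this repo's kernels; as background, it builds on the standard posit format (sign, regime, exponent, fraction fields). The following is a minimal sketch of standard posit decoding, not the ComPosit format itself:

```python
def decode_posit(bits: int, n: int = 8, es: int = 0) -> float:
    """Decode an n-bit standard posit with es exponent bits to a float.

    Standard posit layout: sign bit, a run of identical regime bits plus a
    terminator, up to es exponent bits, then fraction bits with an implicit
    leading 1. This is background only; ComPosit composes on top of this.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")  # NaR (Not a Real)
    sign = -1.0 if bits >> (n - 1) else 1.0
    if sign < 0:
        bits = (-bits) & mask  # two's complement before decoding
    # Regime: run of identical bits after the sign bit.
    body = bits & ((1 << (n - 1)) - 1)
    first = (body >> (n - 2)) & 1
    run, i = 0, n - 2
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first == 1 else -run
    i -= 1  # skip the regime terminator bit (if present)
    # Exponent: up to es bits after the regime, left-aligned if truncated.
    exp = 0
    e_bits = max(0, min(es, i + 1))
    if e_bits > 0:
        exp = (body >> (i + 1 - e_bits)) & ((1 << e_bits) - 1)
        exp <<= es - e_bits
        i -= e_bits
    # Fraction: remaining bits, with an implicit leading 1.
    frac = 1.0
    if i >= 0:
        frac += (body & ((1 << (i + 1)) - 1)) / (1 << (i + 1))
    useed = 2 ** (2 ** es)
    return sign * (useed ** k) * (2 ** exp) * frac
```

For example, with posit⟨8,0⟩, `0b01000000` decodes to 1.0, `0b01010000` to 1.5, and `0b01100000` to 2.0.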

The kernels use tensor cores and half-precision arithmetic, so results and performance may be non-deterministic between runs.
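The run-to-run variation arises because floating-point addition is not associative, and tensor-core GEMMs accumulate partial sums in a hardware-determined order. A minimal Python illustration (the effect is much larger at half precision):

```python
# Floating-point addition is not associative: summing the same values in a
# different order can change the result by a unit in the last place.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one accumulation order
right = a + (b + c)  # another accumulation order

assert left != right  # the two orders disagree in the last bit
```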

How to use / modify

To make changes, you need Nix. Do the multi-user installation: https://nixos.org/download/#nix-install-linux. Then follow the 'Getting started' section of https://github.com/huggingface/kernel-builder/blob/main/docs/nix.md and do the cachix step:

```sh
# Use cachix without installing it
nix run nixpkgs#cachix -- use huggingface

# Enter the dev shell, generate the build files, and install the kernel
nix develop .#devShells.torch28-cxx11-cu128-x86_64-linux
build2cmake generate-torch build.toml
python -m venv .venv
source .venv/bin/activate
pip install --no-build-isolation -e .
```

This should install the kernel as a package into the Python environment in .venv. Run your programs inside the shell that spawned when you ran nix develop above. If Torch doesn't seem to work, apply the first solution at https://danieldk.eu/Nix-CUDA-on-non-NixOS-systems#make-runopengl-driverlib-and-symlink-the-driver-library

Also see https://github.com/zeroby0/PyComposit, a simpler version that tries to do everything in Python.
