Stars
Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU.
Causal depthwise conv1d in CUDA, with a PyTorch interface
intel / sycl-tla
Forked from NVIDIA/cutlass
SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL-based CUTLASS implementation for Intel GPUs
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection
Curated collection of papers in MoE model inference
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Tile primitives for speedy kernels
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
oneAPI - Data Parallel C++ course for students
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
A batched, offline-inference-oriented version of segment-anything
Applied AI experiments and examples for PyTorch
Helpful tools and examples for working with flex-attention
Run PyTorch LLMs locally on servers, desktop and mobile
A PyTorch native platform for training generative AI models
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
Train transformer language models with reinforcement learning.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step