web playground • tweet thread • blog post
Train language models in your browser with WebGPU-powered autodiff.
Sequence Toy is a web playground for training sequence models (Transformers, LSTMs, GRUs, and vanilla RNNs) in the browser. It is built on Piston, a proof-of-concept WebGPU automatic differentiation library. This repository houses both projects.
- Piston is a fork of Ratchet, hacked and butchered to add automatic differentiation. I picked Ratchet because it is simple enough to reason about, yet thoughtfully supports WebGPU via wgpu.
- My implementation of backprop borrows heavily from Candle.
- The lazy execution model is an implementation of the LazyTensor design, and borrows from PyTorch's torch/csrc/lazy.
- I used Keller Jordan's Modded-NanoGPT as a reference for implementing his Muon optimizer.
- My BPE tokenizer implementation is simplified from transformers.js, which in turn mirrors transformers.
- My GPT-2 model implementation was originally based on minGPT.
- I adapted dataset preprocessing code from llm.c.
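To give a flavor of the reverse-mode autodiff at the heart of Piston, here is a toy scalar sketch in Rust. This is purely illustrative: the names (`Var`, `Node`), the `Rc<RefCell<…>>` graph, and the naive recursive backward pass are my own simplification, not Piston's or Candle's actual API, and a real tape-based implementation would topologically sort nodes instead of recursing.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A toy autodiff node: a value, a gradient slot, and the parents it
// was computed from, each paired with the local partial derivative.
#[derive(Clone)]
struct Var(Rc<RefCell<Node>>);

struct Node {
    value: f32,
    grad: f32,
    parents: Vec<(Var, f32)>, // (parent, d(self)/d(parent))
}

impl Var {
    fn new(value: f32) -> Self {
        Var(Rc::new(RefCell::new(Node { value, grad: 0.0, parents: vec![] })))
    }
    fn value(&self) -> f32 { self.0.borrow().value }
    fn grad(&self) -> f32 { self.0.borrow().grad }

    fn add(&self, other: &Var) -> Var {
        let out = Var::new(self.value() + other.value());
        out.0.borrow_mut().parents = vec![(self.clone(), 1.0), (other.clone(), 1.0)];
        out
    }
    fn mul(&self, other: &Var) -> Var {
        let out = Var::new(self.value() * other.value());
        out.0.borrow_mut().parents =
            vec![(self.clone(), other.value()), (other.clone(), self.value())];
        out
    }

    // Reverse-mode sweep: seed d(out)/d(out) = 1, then push gradients
    // to parents via the chain rule. (Naive recursion; a real tape
    // would process nodes in reverse topological order.)
    fn backward(&self) {
        self.0.borrow_mut().grad = 1.0;
        self.backprop();
    }
    fn backprop(&self) {
        let (grad, parents) = {
            let n = self.0.borrow();
            (n.grad, n.parents.clone())
        };
        for (parent, local) in parents {
            parent.0.borrow_mut().grad += grad * local;
            parent.backprop();
        }
    }
}

fn main() {
    // y = a * b + a, with a = 2, b = 3
    let a = Var::new(2.0);
    let b = Var::new(3.0);
    let y = a.mul(&b).add(&a);
    y.backward();
    println!("y = {}", y.value());    // 8
    println!("dy/da = {}", a.grad()); // b + 1 = 4
    println!("dy/db = {}", b.grad()); // a = 2
}
```

The same chain-rule bookkeeping generalizes from scalars to the tensor ops Piston dispatches to WebGPU, with each op recording how to route gradients back to its inputs.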