# Small Language Models
This repo contains a from-scratch implementation of a transformer-based language model. It supports training language models on both next-token and previous-token prediction. New features and upgrades are planned.
- Training and inference scripts for forward (next-token prediction) and reverse (previous-token prediction) GPT-2-like language models (see the batching sketch below).
- Dataloaders for the TinyShakespeare and TinyStories datasets (a loading sketch follows this list).
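
As a sketch of how previous-token prediction can be trained with an otherwise unmodified GPT-2-style model, one common trick is to reverse the token stream and then run ordinary next-token prediction on it. The `get_batch` helper below is a hypothetical illustration of that idea, not this repo's actual API:

```python
import torch

def get_batch(tokens: torch.Tensor, block_size: int, batch_size: int, reverse: bool = False):
    """Sample (input, target) pairs for forward or reverse language modeling."""
    if reverse:
        # Next-token prediction on the reversed stream is equivalent to
        # previous-token prediction on the original text.
        tokens = tokens.flip(0)
    # Random starting offsets for each sequence in the batch.
    ix = torch.randint(len(tokens) - block_size - 1, (batch_size,))
    x = torch.stack([tokens[i : i + block_size] for i in ix])
    y = torch.stack([tokens[i + 1 : i + block_size + 1] for i in ix])  # targets shifted by one
    return x, y

# Example usage with a stand-in for an encoded corpus:
tokens = torch.arange(1000)
xb, yb = get_batch(tokens, block_size=8, batch_size=4, reverse=True)
```

Because the reversal happens entirely in the dataloader, the same model and training loop serve both directions.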
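
For TinyShakespeare, a minimal loading sketch might look like the following. `load_tinyshakespeare` and its character-level encoding are illustrative assumptions rather than the repo's actual dataloader; the download URL is Karpathy's public copy of the corpus, while TinyStories is typically fetched from the Hugging Face Hub instead:

```python
import os
import urllib.request

import torch

# Karpathy's public copy of the TinyShakespeare corpus.
URL = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"

def load_tinyshakespeare(path: str = "input.txt") -> torch.Tensor:
    """Download TinyShakespeare if needed and character-encode it."""
    if not os.path.exists(path):
        urllib.request.urlretrieve(URL, path)
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    # Minimal character-level vocabulary, as in Karpathy's video series.
    stoi = {ch: i for i, ch in enumerate(sorted(set(text)))}
    return torch.tensor([stoi[ch] for ch in text], dtype=torch.long)
```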
Inspired by and adapted from Andrej Karpathy's video series.