pikaGPT: A tiny implementation of a GPT, accelerated for Apple Silicon
Built on picoGPT: a GPT in ~60 Lines of Numpy, using MLX: An array framework
for Apple Silicon
picoGPT: jaymody/picoGPT
Apple MLX: ml-explore/mlx
python pika.py "Alan Turing theorized that computers would one day become"
Returns something like:
generating: 100%|█████████████████████████| 40/40 [00:00<00:00, 51.76it/s]
the most powerful machines on the planet.
The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
Note: Models will be downloaded to /models if needed.
Change the model size, number of tokens to generate, or model directory:
python pika.py \
"Alan Turing theorized that computers would one day become" \
--n_tokens_to_generate 40 \
--model_size "124M" \
--models_dir "models"
To check against the original NumPy implementation (non-MLX), add --numpy:
python pika.py \
"Alan Turing theorized that computers would one day become" \
--numpy
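As a rough illustration of how the flags above fit together, here is a minimal sketch using argparse. It is not the actual pika.py argument handling, which is not shown here. The defaults are inferred from the examples (40 tokens, "124M", "models"), the list of sizes is the standard set of GPT-2 checkpoints, and build_parser is a hypothetical helper name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the examples above."""
    parser = argparse.ArgumentParser(prog="pika.py")
    parser.add_argument("prompt", help="text for the model to continue")
    # Defaults below are inferred from the example runs, not read from pika.py.
    parser.add_argument("--n_tokens_to_generate", type=int, default=40)
    parser.add_argument(
        "--model_size",
        default="124M",
        choices=["124M", "355M", "774M", "1558M"],  # released GPT-2 sizes
    )
    parser.add_argument("--models_dir", default="models")
    parser.add_argument(
        "--numpy",
        action="store_true",
        help="use the plain NumPy path instead of MLX",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(
    ["Alan Turing theorized that computers would one day become", "--numpy"]
)
print(args.model_size, args.n_tokens_to_generate, args.numpy)
```

Restricting --model_size with choices makes a typo such as "124m" fail fast instead of triggering a download attempt for a checkpoint that does not exist.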
On Python 3.12 or later, first run pip install setuptools to get distutils, which was removed from the standard library in 3.12. See docs
pip install -r requirements.txt
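To see whether the setuptools step applies to your interpreter, a small check like the following may help. It is only a sketch: distutils_available is a hypothetical helper name, and it only tests whether the module can be located, not whether the rest of the requirements install cleanly.

```python
import importlib.util


def distutils_available() -> bool:
    """Return True if a `distutils` module can be located by the import system."""
    return importlib.util.find_spec("distutils") is not None


if distutils_available():
    print("distutils found; no extra step needed")
else:
    print("distutils missing; run `pip install setuptools` first")
```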
Tested and benchmarked on Python 3.12.4 and macOS Sonoma 14.5 (M1 Pro, 32GB)
The main script is pika.py, which imports encoder.py (the BPE tokenizer from
OpenAI's GPT-2 release) and uses utils.py to download model files
pikaGPT is based on picoGPT which is "an unnecessarily tiny and minimal
implementation of GPT-2 in plain NumPy. The entire forward pass code is 40
lines of code."
For more, see picoGPT: jaymody/picoGPT
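picoGPT generates text by greedy decoding: run the forward pass, pick the highest-scoring next token, append it to the input, and repeat. Each step of the "generating" progress bar is one such iteration. The sketch below shows only that loop in plain Python. toy_logits is a made-up stand-in for the real GPT-2 forward pass, and the function names are illustrative rather than taken from pika.py.

```python
from typing import Callable, List


def generate(
    inputs: List[int],
    next_token_logits: Callable[[List[int]], List[float]],
    n_tokens_to_generate: int,
) -> List[int]:
    """Greedy decoding: repeatedly append the highest-scoring next token."""
    tokens = list(inputs)
    for _ in range(n_tokens_to_generate):
        logits = next_token_logits(tokens)  # scores for every vocabulary id
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        tokens.append(next_id)
    return tokens[len(inputs):]  # only the newly generated ids


VOCAB_SIZE = 5


def toy_logits(tokens: List[int]) -> List[float]:
    """Hypothetical model: always favours (last token + 1) mod VOCAB_SIZE."""
    target = (tokens[-1] + 1) % VOCAB_SIZE
    return [1.0 if i == target else 0.0 for i in range(VOCAB_SIZE)]


print(generate([0], toy_logits, 4))  # [1, 2, 3, 4]
```

Because the loop always takes the argmax, the output is deterministic, which is consistent with the MLX and --numpy runs producing identical text for the same prompt.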
Open question: whether, and where, to add mx.compile?
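MLX seems to provide a >4x speedup. In the runs below it is about 5.6x for the 124M model (53.03 vs 9.54 it/s) and about 7x for the 1558M model (6.32 it/s vs 1.10 s/it). Compare the iterations per second (it/s):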
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become"
generating: 100%|██████████████████████████| 40/40 [00:00<00:00, 53.03it/s]
the most powerful machines on the planet.
The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become" --numpy
generating: 100%|██████████████████████████| 40/40 [00:04<00:00, 9.54it/s]
the most powerful machines on the planet.
The computer is a machine that can perform complex calculations, and it can perform these calculations in a way that is very similar to the human brain.
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become" --model_size "1558M"
generating: 100%|██████████████████████████| 40/40 [00:06<00:00, 6.32it/s]
so powerful that they would be able to think like humans.
In the 1950s, he proposed a way to build a computer that could think like a human. He called it the "T
(.venv) pikaGPT# python pika.py "Alan Turing theorized that computers would one day become" --model_size "1558M" --numpy
generating: 100%|██████████████████████████| 40/40 [00:43<00:00, 1.10s/it]
so powerful that they would be able to think like humans.
In the 1950s, he proposed a way to build a computer that could think like a human. He called it the "T
pip install -r requirements_dev.txt
Run some tests with make test
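To put numbers on the MLX vs NumPy comparison in the benchmark logs above, the rates can be normalized and divided. This is a small sketch: to_its_per_sec is a hypothetical helper, and the rate strings are copied from the logs. Note that tqdm reports seconds per iteration (s/it) once a run drops below one iteration per second, so that case has to be inverted.

```python
def to_its_per_sec(rate: str) -> float:
    """Convert a tqdm rate like '53.03it/s' or '1.10s/it' to iterations/second."""
    rate = rate.strip()
    if rate.endswith("it/s"):
        return float(rate[: -len("it/s")])
    if rate.endswith("s/it"):
        return 1.0 / float(rate[: -len("s/it")])
    raise ValueError(f"unrecognized rate: {rate!r}")


# (MLX rate, NumPy rate) pairs taken from the benchmark logs.
runs = {
    "124M": ("53.03it/s", "9.54it/s"),
    "1558M": ("6.32it/s", "1.10s/it"),
}

for size, (mlx_rate, numpy_rate) in runs.items():
    speedup = to_its_per_sec(mlx_rate) / to_its_per_sec(numpy_rate)
    print(f"{size}: {speedup:.1f}x")
# 124M: 5.6x
# 1558M: 7.0x
```

These are single runs on one machine, so the ratios are indicative rather than a rigorous benchmark.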