
Modern C++ incarnation of Karpathy's microgpt — pure autograd GPT in ~400 lines, zero dependencies


XingfuY/microgpt-cpp


microgpt-cpp

The most atomic way to train and inference a GPT in modern C++.

A faithful C++20 incarnation of Andrej Karpathy's microgpt.py — a complete autograd engine, transformer, tokenizer, optimizer, training loop, and inference pipeline in a single header file.

Zero dependencies beyond the C++20 standard library.

What's Inside

microgpt.hpp (~370 lines) contains everything:

  • Autograd engine — Value with shared_ptr computation graph, reverse-mode autodiff
  • Operators — +, *, -, /, pow, log, exp, relu with full gradient support
  • Linear algebra — linear(), softmax(), rmsnorm() on Vec/Mat types
  • GPT model — token/position embeddings, multi-head causal self-attention with KV cache, MLP, residual connections
  • Tokenizer — character-level with BOS token
  • Adam optimizer — with bias-corrected moments and learning rate decay
  • Training — cross-entropy loss, full backprop through the entire model
  • Inference — temperature-scaled autoregressive sampling

Quick Start

cmake -B build && cmake --build build

# Train on a names dataset
echo -e "alice\nbob\ncharlie\ndave\neve\nfrank\ngrace" > input.txt
./build/microgpt input.txt 500
# (trains for 500 steps, then samples new names)

# Run tests
./build/microgpt_test

Tests

27 GoogleTest cases covering the full stack:

[==========] Running 27 tests from 6 test suites.
ValueForward    (9 tests)  — forward pass for all operations
ValueBackward   (9 tests)  — gradients + numerical gradient check
Components      (5 tests)  — linear, softmax, rmsnorm, softmax backward
Tokenizer       (1 test)   — encode/decode roundtrip
GPT             (2 tests)  — forward produces finite logits, KV cache grows
Training        (1 test)   — loss decreases over 100 steps
[  PASSED  ] 27 tests.

Configuration

Default config (tiny, CPU-friendly):

Parameter    Value
n_embd       16
n_head       4
n_layer      1
block_size   16

Blog Post

microgpt-cpp: When You Rewrite Karpathy's 200 Lines of Python in C++ Because Why Not

Acknowledgments

Based on Andrej Karpathy's microgpt.py.
