A high-performance deep learning framework with educational clarity
Genesis is a modern deep learning framework built from scratch, combining production-level performance with educational transparency. Featuring Triton-optimized kernels, automatic differentiation, and comprehensive neural network modules, Genesis serves both as a learning resource and a practical training framework.
Core Capabilities
- 🔥 High Performance: Triton-optimized GPU kernels achieving near-native performance
- ⚡ Automatic Differentiation: Dynamic computational graph with full gradient support (see the sketch after this list)
- 🧠 Neural Networks: Complete module library including transformers and attention mechanisms
- 🎯 Mixed Precision: AMP support with FP16/BF16 training
- 🚀 Distributed Training: Multi-GPU training with NCCL backend
- 📦 Model Support: Built-in LLM implementations (Qwen) with training pipelines
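To make the automatic-differentiation claim concrete, here is a minimal sketch that builds a tiny graph and backpropagates through it. It is illustrative only: the `genesis.tensor` constructor and the `requires_grad` / `.backward()` / `.grad` spellings are assumed to follow PyTorch conventions and are not documented in this section.

```python
# Minimal autograd sketch. genesis.tensor / requires_grad / .grad are assumed
# PyTorch-style spellings, not APIs confirmed by this README.
import genesis

x = genesis.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x * x).sum()   # y = sum(x_i^2), recorded on the dynamic graph
y.backward()        # backpropagate through the recorded graph
print(x.grad)       # dy/dx = 2 * x -> [2.0, 4.0, 6.0]
```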
Technical Highlights
- Modular backend system with clean CPU/CUDA separation
- Advanced CUDA memory management with pooling and statistics
- Unified operation dispatch routing to optimal implementations
- Complete optimizer suite (Adam, AdamW, SGD) with schedulers
- Production-ready training pipeline with checkpointing (see the sketch below)
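The checkpointing claim is easiest to illustrate in code. The sketch below assumes Genesis mirrors PyTorch's `state_dict()` / `load_state_dict()` conventions and provides `genesis.save` / `genesis.load` helpers; those names are assumptions for illustration, not APIs documented here.

```python
# Hypothetical checkpointing sketch -- state_dict()/genesis.save()/genesis.load()
# are assumed PyTorch-style names, not confirmed by this README.
import genesis
import genesis.nn as nn
import genesis.optim as optim

model = nn.Linear(784, 10)
optimizer = optim.AdamW(model.parameters(), lr=1e-3)

checkpoint = {
    "model": model.state_dict(),          # assumed PyTorch-style API
    "optimizer": optimizer.state_dict(),  # assumed PyTorch-style API
    "step": 1000,
}
genesis.save(checkpoint, "ckpt.pth")      # hypothetical save helper

# Resume later
state = genesis.load("ckpt.pth")          # hypothetical load helper
model.load_state_dict(state["model"])
optimizer.load_state_dict(state["optimizer"])
```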
Genesis delivers competitive performance through hand-optimized Triton kernels:
| Operation | Efficiency vs Reference |
|---|---|
| Matrix Multiplication | ~95% |
| Softmax | ~112% |
| LayerNorm | ~120% |
| Multi-Head Attention | ~97% |
Benchmarked on NVIDIA A100 GPU
```bash
# Clone repository
git clone https://github.com/phonism/genesis.git
cd genesis

# Install (CPU only)
pip install -e .

# Install with LLM support
pip install -e ".[llm]"

# Verify installation
python -c "import genesis; print(genesis.__version__)"
```

```python
import genesis
import genesis.nn as nn
import genesis.optim as optim
# Define model
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 10)
        self.dropout = nn.Dropout(0.2)

    def forward(self, x):
        x = self.fc1(x).relu()
        x = self.dropout(x)
        return self.fc2(x)
# Training setup
model = Net()
optimizer = optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# Training loop (dataloader: any iterable of (data, target) batches)
for data, target in dataloader:
    output = model(data)
    loss = criterion(output, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Mixed precision training with AMP:

```python
from genesis.cuda import amp
scaler = amp.GradScaler()
for data, target in dataloader:
    with amp.autocast():
        output = model(data)
        loss = criterion(output, target)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    optimizer.zero_grad()
```

```bash
# Single command for multi-GPU training
torchrun --nproc_per_node=4 train.py
```

```python
import genesis.distributed as dist
# Initialize
dist.init_process_group(backend='nccl')
# Wrap model
from genesis.distributed import DistributedDataParallel as DDP
model = DDP(model)
# Train normally - gradients synchronized automatically
```
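For context, a minimal `train.py` entry point for the torchrun command above might look like the sketch below. It reuses only the calls shown in this README; the `LOCAL_RANK` environment variable is set by torchrun itself, and device placement is left as a comment because Genesis's device API is not documented here.

```python
# Minimal multi-GPU entry point sketch for torchrun. Device placement is only
# hinted at in a comment; the exact Genesis device API is not documented here.
import os
import genesis.nn as nn
import genesis.distributed as dist
from genesis.distributed import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    model = nn.Linear(784, 10)                   # any Genesis model
    # ...move `model` to GPU `local_rank` using Genesis's device API...
    model = DDP(model)                           # gradients all-reduced during backward()
    # ...standard training loop, as in the quick-start example above...

if __name__ == "__main__":
    main()
```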
```
genesis/
├── tensor.py          # Core tensor with autograd
├── function.py        # Autodiff functions
├── backends/          # CPU/CUDA implementations
│   ├── cpu.py
│   ├── cuda.py
│   └── cuda_memory.py
├── ops/               # Operation dispatch
├── nn/                # Neural network modules
│   ├── modules/       # Layer implementations
│   ├── functional.py  # Functional operations
│   └── triton_ops/    # Optimized kernels
├── optim/             # Optimizers
├── distributed/       # Multi-GPU support
└── cuda/              # CUDA utilities & AMP
```
Train Qwen LLM
```bash
cd apps/llm
python train_sft_qwen.py --amp --dtype fp16
```

Interactive Chat
```bash
cd apps/llm
python chat_qwen.py --checkpoint model.pth
```

Benchmarks
```bash
python benchmark/bench_matmul.py
python benchmark/bench_qwen_training.py
```

```bash
# Run test suite
pytest tests/ -v
# With coverage
pytest tests/ --cov=genesis --cov-report=html
```

We welcome contributions! Genesis is designed to be hackable and educational.
```bash
# Development setup
pip install -e ".[dev]"

# Format and run tests before submitting
black genesis/ && isort genesis/
pytest tests/
```

See CONTRIBUTING.md for guidelines.
MIT License - see LICENSE for details.
Genesis builds on ideas from PyTorch, Triton, TinyGrad, and JAX. We thank these projects for their inspiration and the deep learning community for their support.
Built for deep learning researchers and practitioners
⭐ Star us on GitHub if you find Genesis useful!