A miniature autograd engine designed for educational purposes, implementing automatic differentiation in pure NumPy.
Note
Status: Active Development
Not production-ready - created for learning purposes
"Chibi" (ใกใณ) means "small" or "miniature" in Japanese. This is a tiny autograd engine designed for learning purposes - hence ChibiGrad!
- Tensor operations with automatic differentiation
- Neural network layers (Linear)
- Basic operations (Add, Multiply, Power, etc.)
- Loss functions (MSE, with more to come)
- GPU-free (NumPy based)
- PyTorch-like API for easier learning
chibigrad requires Python 3.8+ and NumPy. Clone and install dependencies:
- Clone the Repository

```bash
git clone https://github.com/sumitdotml/chibigrad.git
cd chibigrad
```

- Create Virtual Environment

```bash
python -m venv .venv          # Create virtual environment
source .venv/bin/activate     # Activate (Linux/Mac)
.venv\Scripts\activate        # Activate (Windows)
```

- Install Dependencies

```bash
pip install -r requirements.txt
```

For development, the following steps set up an editable install with test dependencies:

- Clone and Set Up Virtual Environment

```bash
python -m venv .venv          # Create virtual environment
source .venv/bin/activate     # Activate (Linux/Mac)
.venv\Scripts\activate        # Activate (Windows)
```

- Editable Install with Development Dependencies

```bash
pip install -e ".[tests]"     # Install package in editable mode with test deps
```

- Verify Installation

```bash
python -c "import chibigrad; print(chibigrad.__version__)"
# Should output: 0.1.0
```

- Development Workflow
```bash
# Install pre-commit hooks (optional but recommended)
pre-commit install

# Run tests after changes
python -m tests.check --test all

# Reinstall after major changes
pip install -e . --force-reinstall
```

| Dependency Group | Packages | Purpose |
|---|---|---|
| core | numpy, rich | Core functionality |
| tests | torch | Gradient comparison tests |
| dev | black, flake8, mypy | Code quality (optional) |
To add development dependencies:
```bash
pip install black flake8 mypy  # Code formatting and linting
```

Basic tensor operations with automatic differentiation:

```python
from chibigrad.tensor import Tensor
import numpy as np
# Create tensors
x = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
y = Tensor([[2.0, 1.0], [4.0, 3.0]], requires_grad=True)
# Basic arithmetic
z = x + y # Addition
w = x * y # Element-wise multiplication
m = x @ y # Matrix multiplication
# Reduction operations
mean = x.mean()
sum_x = x.sum()
# Activation functions
activated = x.relu() # ReLU activation
# Backward pass
loss = (z ** 2).mean()
loss.backward()
# Access gradients
print(x.grad)  # Gradients for x
```
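As a quick check of what to expect from the backward pass: `z = x + y` evaluates to `[[3, 3], [7, 7]]`, `loss` is `(z ** 2).mean()`, and addition passes gradients through unchanged, so `x.grad` (and likewise `y.grad`) should work out to `2 * z / 4 = [[1.5, 1.5], [3.5, 3.5]]`, assuming the usual reverse-mode semantics.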
A small neural network using the `Linear` layer and MSE loss:

```python
from chibigrad.tensor import Tensor
from chibigrad.linear import Linear
from chibigrad.loss import MSELoss
# Create model
class SimpleNN:
    def __init__(self):
        self.linear1 = Linear(2, 4)
        self.linear2 = Linear(4, 1)

    def forward(self, x):
        x = self.linear1(x)
        x = x.relu()
        return self.linear2(x)
# Training data
X = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
y_true = Tensor([[3.0], [7.0]])
# Model and loss
model = SimpleNN()
criterion = MSELoss()
# Forward pass
y_pred = model.forward(X)
loss = criterion(y_pred, y_true)
# Backward pass
loss.backward()
```
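`optim.py` is still a placeholder, so parameter updates currently have to be written by hand. Below is a minimal sketch of one gradient-descent step; the `weight`/`bias` attribute names and the NumPy-backed `.data`/`.grad` fields are assumptions about the `Linear` layer, not a documented API:

```python
# Hypothetical manual SGD step following loss.backward().
# Assumes Linear exposes `weight` and `bias` tensors whose `.data`
# and `.grad` are NumPy arrays (an assumption, not a documented API).
lr = 0.01
for layer in (model.linear1, model.linear2):
    for param in (layer.weight, layer.bias):
        if param.grad is not None:
            param.data -= lr * param.grad  # descend along the gradient
            param.grad = None              # clear before the next backward pass
```

Once an optimizer module lands in `optim.py`, this loop would collapse into a single `step()` call.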
`python -m tests.check` runs essential smoke tests:

- Basic network functionality
- Memory management
- Numerical stability
- Broadcasting operations
```bash
# Run specific test files
python -m tests.test_operations  # Basic operations
python -m tests.test_training    # Training & optimization
python -m tests.check            # Sanity checks

# Run all tests
python -m pytest tests/
```

Project structure:

```
chibigrad/
├── chibigrad/           # Core autograd engine implementation
│   ├── tensor.py        # Tensor class with automatic differentiation
│   ├── operation.py     # Base class for all operations
│   ├── arithmetic.py    # Arithmetic operations (Add, Multiply, etc.)
│   ├── matmul.py        # Matrix multiplication operation
│   ├── linear.py        # Neural network Linear layer
│   ├── loss.py          # Loss functions (MSE currently)
│   ├── activations.py   # Activation functions (placeholder for now, will be added soon)
│   ├── optim.py         # Optimizers (placeholder for now, will be added soon)
│   └── module.py        # Base class for neural network modules
├── tests/               # Comprehensive test suite
│   └── check.py         # Gradient comparison tests against PyTorch
├── requirements.txt     # Python dependencies
├── setup.py             # Package installation configuration
└── README.md            # you are here
```
For a linear layer `y = x @ W + b`, the backward pass uses the standard gradients:

| Gradient | Formula |
|---|---|
| ∂L/∂x | ∂L/∂y · Wᵀ |
| ∂L/∂W | xᵀ · ∂L/∂y |
| ∂L/∂b | ∂L/∂y summed over the batch dimension |
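The same formulas in plain NumPy, as an illustration of the math only (this is not chibigrad's actual `linear.py` code):

```python
import numpy as np

def linear_backward(x, W, grad_out):
    """Gradients for y = x @ W + b, given upstream grad_out = dL/dy.

    Illustrative sketch of the table above, not chibigrad's implementation.
    """
    grad_x = grad_out @ W.T        # dL/dx = dL/dy . W^T
    grad_W = x.T @ grad_out        # dL/dW = x^T . dL/dy
    grad_b = grad_out.sum(axis=0)  # dL/db = dL/dy summed over the batch
    return grad_x, grad_W, grad_b
```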
- Tensor operations
- Linear layer
- Backward pass
- MSE loss
- Broadcasting in backward pass
- Working Linear Layer
- Activation functions (ReLU, Sigmoid)
- Optimizers (SGD, Adam)
- Convolutional layers
- More robust tests
Warning
Disclaimer: This is a toy project for learning purposes. For production use, consider established frameworks like PyTorch or TensorFlow.
- **Tensor Class** (`tensor.py`)
  - Core data structure tracking computational graph
  - Handles automatic differentiation via backward passes
  - Supports common operations (+, *, @, etc.) with operator overloading
  - Manages gradient computation and broadcasting
- **Operations System**
  - `operation.py`: Base class for all operations
  - `arithmetic.py`: Elementary math operations with gradient rules
    - Add, Multiply, Power, Sum, Mean
  - `matmul.py`: Matrix multiplication with broadcasting support
- **Neural Network Components**
  - `linear.py`: Fully-connected layer implementation
    - Xavier initialization for weights
    - Proper gradient tracking through matrix operations
  - `loss.py`: Mean Squared Error (MSE) implementation
    - Batch-aware gradient computation
    - Efficient computation graph construction
- **Testing Infrastructure**
  - Gradient comparison tests against PyTorch (sketched below this list)
  - Detailed numerical validation
  - Rich terminal output for test results
  - Three test modes: `arithmetic`, `mse`, and `all`
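The gradient comparison boils down to running the same computation in chibigrad and in PyTorch and asserting that the resulting gradients agree. A minimal sketch of that pattern (not the actual contents of `tests/check.py`; it also assumes `x.grad` is a NumPy array or array-like):

```python
import numpy as np
import torch
from chibigrad.tensor import Tensor

data = [[1.0, 2.0], [3.0, 4.0]]

# chibigrad side
x = Tensor(data, requires_grad=True)
loss = (x * x).mean()
loss.backward()

# PyTorch reference
x_t = torch.tensor(data, requires_grad=True)
loss_t = (x_t * x_t).mean()
loss_t.backward()

# Gradients should agree to numerical precision
assert np.allclose(np.asarray(x.grad), x_t.grad.numpy(), atol=1e-6)
```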
- Optimized for learning and clarity over speed
- Memory-efficient tensor operations
- Automatic gradient cleanup
- Comparable performance to PyTorch for small to medium networks