
Small-scale distributed training of sequential deep learning models, built on NumPy and MPI.


siboehm/ShallowSpeed


ShallowSpeed

[stability: work-in-progress]

A tiny proof-of-concept implementation of distributed training for sequential deep learning models, built using plain NumPy & mpi4py.

Currently implements:

  • Sequential models / deep MLPs, trained using SGD.
  • Data-parallel training with interleaved communication & computation, similar to PyTorch's DistributedDataParallel.
  • Pipeline-parallel training:
    • Naive schedule without interleaved stages.
    • GPipe schedule with interleaved FWDs & interleaved BWDs.
    • (soon) PipeDream-Flush schedule with additional interleaving between FWD & BWD.
  • Any combination of DP & PP algorithms.
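The data-parallel scheme above can be sketched without MPI: each worker computes a gradient on its own data shard, the gradients are averaged across workers (the job `Allreduce` does in the real mpi4py implementation), and every replica applies the identical update so the model copies stay in sync. A minimal NumPy sketch, using a hypothetical linear-regression loss; the helper names (`simulated_allreduce`, `data_parallel_step`) are illustrative and not part of this repository:

```python
import numpy as np

def simulated_allreduce(grads_per_worker):
    """Average one parameter's gradients across workers, standing in
    for an MPI Allreduce with op=SUM followed by / world_size."""
    return sum(grads_per_worker) / len(grads_per_worker)

def data_parallel_step(weights, worker_batches, lr=0.1):
    """One data-parallel SGD step: each worker computes a local
    gradient on its shard, gradients are averaged, and every replica
    applies the same update (toy linear-regression loss for brevity)."""
    local_grads = []
    for X, y in worker_batches:
        # Gradient of the mean of 0.5 * (X @ w - y)^2 w.r.t. w, on this shard.
        local_grads.append(X.T @ (X @ weights - y) / len(y))
    avg_grad = simulated_allreduce(local_grads)
    return weights - lr * avg_grad
```

With equal-sized shards, the average of the per-worker mean gradients equals the full-batch mean gradient, which is why data-parallel training reproduces sequential training exactly (up to floating-point ordering).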

Setup

conda env create
pip install -e .
# M1 Macs: conda install "libblas=*=*accelerate"
python download_dataset.py
pytest

Usage

# Sequential training
python train.py
# Data parallel distributed training
mpirun -n 4 python train.py --dp 4
# Pipeline parallel distributed training
mpirun -n 4 python train.py --pp 4 --schedule naive
# Data & pipeline parallel distributed training
mpirun -n 8 python train.py --dp 2 --pp 4 --schedule gpipe
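To make the schedule names above concrete, here is a hedged sketch of the op ordering a GPipe-style schedule produces: all microbatch forward passes are pipelined across the stages first, then all backward passes run in reverse stage order. The function name `gpipe_schedule` is illustrative, not an API of this repo, and the sketch ignores activation-memory bookkeeping:

```python
def gpipe_schedule(n_stages, n_microbatches):
    """Return, per time step, the list of (stage, microbatch, phase)
    ops a GPipe-style schedule runs: forwards fill the pipeline, then
    backwards drain it in reverse stage order (simplified sketch)."""
    timeline = []
    # Forward phase: microbatch m reaches stage s at time step s + m.
    for t in range(n_stages + n_microbatches - 1):
        timeline.append([(s, t - s, "fwd") for s in range(n_stages)
                         if 0 <= t - s < n_microbatches])
    # Backward phase: starts after all forwards finish; microbatch m
    # reaches stage s at time step (n_stages - 1 - s) + m.
    for t in range(n_stages + n_microbatches - 1):
        timeline.append([(s, t - (n_stages - 1 - s), "bwd")
                         for s in range(n_stages)
                         if 0 <= t - (n_stages - 1 - s) < n_microbatches])
    return timeline
```

Reading the timeline for 2 stages and 3 microbatches shows the characteristic pipeline "bubble": stage 1 idles at step 0 while stage 0 runs the first forward, and stage 0 idles at the start of the backward phase.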

Internals
