This repository contains several examples and tutorials that showcase usage of PhysicsNeMo's ShardTensor utility.
NOTE: These examples will shortly be upstreamed into the PhysicsNeMo Example repository. Bug fixes and new examples will appear there, not here.
The contents of the repository are:
- Vector Addition - See how to use ShardTensor for basic domain parallelism, in an operation that requires no collectives.
- Vector Dot Product - See how to extend an operation with a collective reduction to compute a dot product over distributed tensors.
- kNN - Parallelize a more complicated and challenging operation with a ring passing scheme.
- Convolution - See how to apply a loss function and backward pass for domain parallel operations, and validate numerical accuracy and gradient placements.
- ViT - Learn how to implement a full training loop with domain parallelism, and benchmark computational speed and memory usage. Shows the differences in the training script for single-GPU, 1D (DDP), and 2D (ShardTensor + FSDP) parallelism.
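
To give a feel for the idea behind the Vector Dot Product example, here is a minimal single-process sketch in plain Python. It does not use ShardTensor or `torch.distributed`. The helper names (`shard`, `local_dot`, `all_reduce_sum`, `sharded_dot`) are hypothetical and exist only for this illustration: each simulated rank computes a partial dot product on its own shard, and a sum reduction stands in for the collective that combines the partial results.

```python
def shard(values, world_size):
    """Split a list into world_size contiguous chunks (trailing chunks may be shorter)."""
    chunk = -(-len(values) // world_size)  # ceiling division
    return [values[r * chunk:(r + 1) * chunk] for r in range(world_size)]


def local_dot(a_shard, b_shard):
    """Partial dot product over one rank's local shard. No communication needed."""
    return sum(x * y for x, y in zip(a_shard, b_shard))


def all_reduce_sum(partials):
    """Simulated collective: every rank ends up holding the sum of all partials."""
    total = sum(partials)
    return [total for _ in partials]


def sharded_dot(a, b, world_size):
    """Dot product of a and b, computed shard by shard and then reduced."""
    a_shards = shard(a, world_size)
    b_shards = shard(b, world_size)
    partials = [local_dot(x, y) for x, y in zip(a_shards, b_shards)]
    return all_reduce_sum(partials)


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 2.0, 2.0, 2.0, 2.0]
results = sharded_dot(a, b, world_size=2)
print(results)  # one value per simulated rank: [30.0, 30.0]
```

Vector addition needs only the equivalent of `local_dot` applied elementwise, which is why it requires no collective. The dot product adds exactly one reduction step on top of the local work.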
Learn more about the tools used in these examples: