ShardTensorExamples

This repository contains several examples and tutorials that showcase usage of PhysicsNeMo's ShardTensor utility.

NOTE: These examples will soon be upstreamed into the PhysicsNeMo Example repository. Bug fixes and new examples will appear there, not here.

The contents of the repository are:

Vector Addition - See how to use ShardTensor for basic domain parallelism, in an operation that requires no collectives.
Vector Dot Product - See how to extend an operation with a collective reduction to compute a dot product over distributed tensors.
kNN - Parallelize a more complicated and challenging operation with a ring passing scheme.
Convolution - See how to apply a loss function and backward pass for domain parallel operations, and validate numerical accuracy and gradient placements.
ViT - Learn how to implement a full training loop with domain parallelism, and benchmark computational speed and memory usage. Shows how the training script differs between single-GPU, 1D (DDP), and 2D (ShardTensor + FSDP) parallelism.
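
To illustrate the difference between the first two examples, here is a minimal single-process sketch in plain Python. It does not use ShardTensor or any PhysicsNeMo API. The helper names (`shard`, `sharded_add`, `sharded_dot`) are hypothetical and exist only for this illustration, with each inner list standing in for the shard owned by one rank. Under those assumptions, elementwise addition needs no communication, while the dot product needs one sum reduction across ranks.

```python
# Single-process simulation of domain parallelism: one inner list per "rank".
# Illustration only; the real examples use ShardTensor and real collectives.

def shard(values, world_size):
    """Split a list into world_size contiguous chunks (sizes differ by at most 1)."""
    base, extra = divmod(len(values), world_size)
    shards, start = [], 0
    for rank in range(world_size):
        size = base + (1 if rank < extra else 0)
        shards.append(values[start:start + size])
        start += size
    return shards

def sharded_add(a_shards, b_shards):
    """Elementwise add: every rank works only on its own shard, no collective."""
    return [[x + y for x, y in zip(a, b)] for a, b in zip(a_shards, b_shards)]

def sharded_dot(a_shards, b_shards):
    """Dot product: local partial sums followed by a sum reduction across ranks."""
    partials = [sum(x * y for x, y in zip(a, b)) for a, b in zip(a_shards, b_shards)]
    # Summing the partials stands in for an all-reduce (sum) collective.
    return sum(partials)

a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
a_shards, b_shards = shard(a, 2), shard(b, 2)

sum_shards = sharded_add(a_shards, b_shards)  # result stays sharded
dot = sharded_dot(a_shards, b_shards)         # result is a single scalar
```

The addition result keeps the same sharding as its inputs, which is why no data has to move. The dot product collapses the sharded dimension, so each rank's partial sum has to be combined before any rank knows the final value.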

Resources

Learn more about the tools used in these examples:
