This repository contains code implementations for several research papers, including [LSAM], [ESGD], [LSGD], [DP-SAM] and [DP-SGD].
These methods can be interpreted from a sampling-based perspective, depending on whether they modify the loss landscape or not. The repository is currently under active development.
The project only requires plain PyTorch. Install PyTorch from the official website.
- On the first GPU node, run `ifconfig` in the terminal to find its IP address;
- Open the bash file `train.sh`: fill in `ip_addr` with the IP address obtained in step 1, and `num_groups` with the total number of GPU nodes you have (i.e. `num_groups=n`);
- On the i-th GPU node, run `bash train.sh [i-1]` in the terminal (i.e. the index starts from 0) to launch the code.
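The steps above might be wired together along the following lines. This is only a sketch of what a launcher like `train.sh` could contain; the `torch.distributed.run` entry point, the master port, the entry script name `train.py`, and all values are assumptions, not the repository's actual script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a multi-node launch wrapper similar to train.sh.
# ip_addr, num_groups, the port, and train.py are placeholders.
ip_addr="192.168.1.10"   # IP of the first node, obtained via ifconfig
num_groups=4             # total number of GPU nodes (n)
node_rank="${1:-0}"      # pass i-1 on the i-th node, so ranks run 0..n-1

launch_cmd="python -m torch.distributed.run \
  --nnodes=$num_groups --node_rank=$node_rank \
  --master_addr=$ip_addr --master_port=29500 train.py"

echo "$launch_cmd"
# eval "$launch_cmd"     # uncomment to actually launch training
```

Every node runs the same command with a different rank; the node whose IP is given as `master_addr` coordinates the rendezvous.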
- Run `nvidia-smi` in the terminal to check NVIDIA driver and CUDA compatibility;
- Check that the PyTorch version is consistent across the machines.
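A quick way to compare builds is to run one command on every machine and eyeball that the output matches. This is a generic check, not part of the repository; it assumes `python3` is on `PATH` and degrades gracefully when torch is absent.

```shell
#!/usr/bin/env bash
# Print the local PyTorch build and its CUDA version so the output can be
# compared across machines; falls back if torch is not installed.
ver="$(python3 -c "import torch; print('torch:', torch.__version__, 'cuda:', torch.version.cuda)" 2>/dev/null \
      || echo 'torch: not installed')"
echo "$ver"
```

If the printed versions differ between nodes, distributed initialization may hang or fail with opaque NCCL errors, so align them before launching.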