This is the source code for the paper *Asynchronous Federated Optimization*.
The following Python packages need to be installed via pip:
- MXNet (we use an Intel CPU cluster, so mxnet-mkl is preferred)
- GluonCV
- NumPy
- Keras (with the TensorFlow backend; used only for dataset preparation, not for model training)
- Pillow (PIL; also used only for dataset preparation)
Users can simply run the following command in their own virtualenv:
```
pip install --no-cache-dir numpy mxnet-mkl gluoncv keras pillow
```
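To verify the installation, the imports can be checked from Python (this snippet is just a convenience for this README, not part of the repository):

```python
# Confirm that all required packages are importable and print their versions.
import mxnet, gluoncv, numpy, keras, PIL

for pkg in (mxnet, gluoncv, numpy, keras, PIL):
    print(pkg.__name__, pkg.__version__)
```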
Options of the dataset-preparation script:

| Option | Description |
|---|---|
| --output DATASET_DIR | the directory where the partitioned dataset will be placed |
| --nsplit 100 | partition the data into 100 devices |
| --normalize 1 | normalize the data |
- Partition the CIFAR-10 dataset:
```
python convert_cifar10_to_np_normalized.py --nsplit 100 --normalize 1 --output DATASET_DIR
```
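Each partition can then be loaded back with NumPy for a quick inspection. Note that the file names in this sketch are only a guess for illustration; check the converter's actual output under DATASET_DIR for the real layout:

```python
import os
import numpy as np

DATASET_DIR = "path/to/dataset"  # the --output directory
device_id = 0                    # one of the --nsplit partitions

# Hypothetical file names: the converter's actual naming may differ.
data = np.load(os.path.join(DATASET_DIR, "train_data_%03d.npy" % device_id))
labels = np.load(os.path.join(DATASET_DIR, "train_label_%03d.npy" % device_id))

# With --nsplit 100, each device holds 500 of CIFAR-10's 50,000 training images.
print(data.shape, labels.shape)
```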
Options of the training script:

| Option | Description |
|---|---|
| --dir DATASET_DIR | the directory where the training dataset is placed |
| --valdir VAL_DATASET_DIR | the directory where the validation dataset is placed |
| --batchsize 50 | batch size of the workers |
| --epochs 2000 | total number of epochs |
| --interval 10 | log interval |
| --nsplit 100 | training data is partitioned into 100 devices |
| --lr 0.1 | learning rate |
| --rho 0.01 | regularization weight; note that this is \gamma \times \rho, not the \rho in the paper (see the sketch after this table) |
| --alpha 0.8 | mixing hyperparameter |
| --log | path to the log file |
| --classes 10 | number of different classes/labels |
| --iterations 1 | number of local epochs in each global epoch |
| --alpha-decay | alpha decay rate |
| --alpha-decay-epoch | epochs at which alpha decays |
| --alpha-type | type of adaptive alpha; options are 'none', 'power', and 'hinge' (see the sketch after this table) |
| --alpha-adaptive | hyperparameter of adaptive alpha (a) |
| --alpha-adaptive2 | hyperparameter of adaptive alpha (b) |
| --max-delay | maximum global delay |
| --model default | name of the model; "default" is the CNN used in the paper's experiments |
| --seed 337 | random seed |
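For intuition about how --alpha, --alpha-type, --alpha-adaptive (a), --alpha-adaptive2 (b), and --rho fit together, below is a minimal sketch of the FedAsync update rule described in the paper. It is written for this README as an illustration, not taken from the repository's implementation, and all function and variable names are made up.

```python
import numpy as np

def staleness_weight(staleness, alpha_type="none", a=0.5, b=4.0):
    """Staleness function s(t - tau) from the paper (sketch).
    Here a corresponds to --alpha-adaptive and b to --alpha-adaptive2."""
    if alpha_type == "power":   # polynomial: s = (t - tau + 1)^(-a)
        return (staleness + 1.0) ** (-a)
    if alpha_type == "hinge":   # constant up to b, then decaying
        return 1.0 if staleness <= b else 1.0 / (a * (staleness - b) + 1.0)
    return 1.0                  # 'none': constant alpha

def server_update(x_global, x_local, alpha, staleness, alpha_type, a, b):
    """Mix a (possibly stale) local model into the global model:
    x_t = (1 - alpha_t) * x_{t-1} + alpha_t * x_local."""
    alpha_t = alpha * staleness_weight(staleness, alpha_type, a, b)
    return (1.0 - alpha_t) * x_global + alpha_t * x_local

def local_sgd_step(x, grad, x_server, lr, rho_code):
    """One SGD step on the regularized local objective
    f(x) + (rho / 2) * ||x - x_server||^2. Since --rho already stores
    gamma * rho (with gamma the learning rate), the proximal term is
    not multiplied by lr again."""
    return x - lr * grad - rho_code * (x - x_server)

# Example: a stale update (staleness 3) is weighted down under 'power'.
x_new = server_update(np.zeros(3), np.ones(3), alpha=0.8,
                      staleness=3, alpha_type="power", a=0.5, b=4.0)
print(x_new)  # [0.4 0.4 0.4], since 0.8 * (3 + 1)^(-0.5) = 0.4
```

With --alpha-type none, alpha simply stays at --alpha (optionally reduced by --alpha-decay at the epochs listed in --alpha-decay-epoch); with 'power' or 'hinge', updates from more stale workers are mixed in with a smaller weight, where staleness is bounded by --max-delay.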
- Train with 100 workers/partitions on the default model:
```
python fed_async_paper/train_cifar10_mxnet_fedasync_singlethread_impl.py --classes 10 --model default --nsplit 100 --batchsize 50 --lr 0.1 --rho 0.01 --alpha 0.8 --alpha-decay 0.5 --alpha-decay-epoch 800 --epochs 2000 --max-delay 12 --iterations 1 --seed 336 --dir $inputdir --valdir $valdir -o $logfile 2>&1 | tee $watchfile
```

More detailed commands/instructions can be found in the demo script experiment_script_1.sh.
The experiments were executed on the Intel vLab Academic Cluster; some environment variables in the script "experiment_script_1.sh" may be unnecessary in other environments.
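Since the training script runs in a single thread (note the "singlethread" in its name), asynchrony with bounded staleness has to be simulated. A toy sketch of such a simulation, purely illustrative and not the repository's code, might look like this:

```python
import random

# Toy single-thread simulation of asynchronous updates with bounded delay.
# A real run trains a CNN per worker; here each "local update" just pulls a
# scalar model toward a target, to show the staleness bookkeeping only.
random.seed(337)
max_delay, alpha, target = 12, 0.8, 1.0
x_global, version = 0.0, 0
pending = []  # (global model version when the worker pulled, local model)

for _ in range(200):
    # A worker pulls the current global model and "trains" it locally.
    local = x_global + 0.5 * (target - x_global) + random.gauss(0.0, 0.01)
    pending.append((version, local))
    # Deliver queued updates at random, but force delivery before the
    # staleness of the oldest pending update exceeds max_delay.
    while pending and (version - pending[0][0] >= max_delay or random.random() < 0.5):
        tau, x_local = pending.pop(0)
        alpha_t = alpha * (version - tau + 1.0) ** (-0.5)  # 'power' weighting, a=0.5
        x_global = (1.0 - alpha_t) * x_global + alpha_t * x_local
        version += 1

print("final global model: %.3f (target %.1f)" % (x_global, target))
```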