An approach for foundation model finetuning in multi-modal heterogeneous federated learning. [Pre-print]

We propose the Dual-Adapter Teacher (DAT) module and apply Mutual Knowledge Distillation (MKD) to mitigate client-side local data heterogeneity across different modalities.
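To make the MKD idea concrete, below is a minimal stdlib-only sketch of a symmetric knowledge-distillation term between the predictions of two heads (e.g. the two adapters in DAT). The function names, the temperature value, and the `T^2` scaling are illustrative assumptions, not taken from the FedDAT codebase; see the paper for the exact objective.

```python
import math

def softened_probs(logits, T=2.0):
    """Temperature-softened softmax over a list of logits."""
    z = [l / T for l in logits]
    m = max(z)  # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def kl_div(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def mutual_distillation_loss(logits_a, logits_b, T=2.0):
    """Symmetric KL between the softened predictions of two heads,
    scaled by T^2 as is conventional in distillation losses."""
    p = softened_probs(logits_a, T)
    q = softened_probs(logits_b, T)
    return 0.5 * (kl_div(p, q) + kl_div(q, p)) * T * T
```

The loss is zero when both heads agree and grows as their softened predictions diverge, which is what lets each adapter act as a teacher for the other.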
- Create Conda environment with Python 3.8

  ```shell
  conda create -n feddat python=3.8
  conda activate feddat
  ```
- Install requirements

  ```shell
  git clone https://github.com/HaokunChen245/FedDAT.git
  cd FedDAT
  pip install -r requirements.txt
  pip install -U adapters
  pip install accelerate
  ```
- Prepare datasets and pretrained models
| Dataset | Link |
|---|---|
| AQUA | https://github.com/noagarcia/ArtVQA/tree/master/AQUA |
| COCO-QA | http://www.cs.toronto.edu/~mren/imageqa/data/cocoqa/cocoqa-2015-05-17.zip |
| Images for COCO-QA | https://cocodataset.org/#download |
| Abstract Scenes | https://visualqa.org/download.html |
| VizWiz | https://vizwiz.org/tasks-and-datasets/vqa/ |
| GQA | https://cs.stanford.edu/people/dorarad/gqa/download.html |
| VG_100K | https://huggingface.co/datasets/visual_genome |
| Function & Scene (CLOVE benchmark) | TODO |
Put the datasets in the folder `/data`.
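As a quick sanity check before training, a small helper can report which dataset subfolders are still missing under the data root. The subfolder names below are illustrative assumptions based on the dataset table above; the names the repo's data loaders actually expect may differ, so check the loader code in `src/`.

```python
from pathlib import Path

DATA_ROOT = Path("/data")
# Illustrative subfolder names -- NOT necessarily the names the FedDAT
# data loaders expect; verify against the loader code in src/.
DATASETS = ["aqua", "cocoqa", "abstract_scenes", "vizwiz", "gqa", "vg_100k"]

def missing_datasets(root=DATA_ROOT, names=DATASETS):
    """Return the dataset subfolders not yet present under the data root."""
    return [n for n in names if not (root / n).is_dir()]

if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing datasets:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```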
| Model | Link |
|---|---|
| ALBEF | https://storage.googleapis.com/sfr-pcl-data-research/ALBEF/ALBEF.pth |
| ViLT | https://huggingface.co/dandelin/vilt-b32-mlm |
| BERT | https://huggingface.co/bert-base-uncased/tree/main |
Put the models in the folder `/models`.
```shell
# Training with ViLT
bash src/train_vilt.sh

# Training with ALBEF
bash src/train_albef.sh
```
```bibtex
@article{chen2023feddat,
  title={FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning},
  author={Chen, Haokun and Zhang, Yao and Krompass, Denis and Gu, Jindong and Tresp, Volker},
  journal={arXiv preprint arXiv:2308.12305},
  year={2023}
}
```