This repository contains the source code for the paper Should We Be Pre-Training? Exploring End-Task Aware Training In Lieu of Continued Pre-training, by Lucio M. Dery, Paul Michel, Ameet Talwalkar and Graham Neubig (ICLR 2022).
- Paper: https://openreview.net/forum?id=2bO2x8NAIMB
- BibTeX:
@inproceedings{dery2022should,
  title={Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative},
  author={Lucio M. Dery and Paul Michel and Ameet Talwalkar and Graham Neubig},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=2bO2x8NAIMB}
}
This repo builds off the Don't Stop Pre-training paper repo here. Please follow their installation instructions, repeated here for convenience:
conda env create -f environment.yml
conda activate domains

Our experiments were run on A6000 and A100 GPUs, which have more than 40GB of GPU memory. To ensure that batches fit into memory, consider modifying the following variables:
--classf_iter_batchsz # Batch size for the primary task. Effective batch size is (classf_iter_batchsz * gradient_accumulation_steps)
--per_gpu_train_batch_size # Batch size for the auxiliary tasks. Effective batch size is (per_gpu_train_batch_size * gradient_accumulation_steps)
--gradient_accumulation_steps # Number of steps to accumulate gradients over
A sketch of how these flags combine into effective batch sizes is given below.
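The following is a minimal, hypothetical sketch rather than the repo's exact script contents: the shell variable names and values are placeholders, and only the three flag names above come from this repo.

# Hypothetical batch-size settings -- tune these to your GPU memory.
CLASSF_BSZ=16   # per-step batch size for the primary (end-task) loss
AUX_BSZ=8       # per-step batch size for the auxiliary (MLM) losses
GRAD_ACCUM=2    # number of steps to accumulate gradients over
# Effective primary-task batch size   = CLASSF_BSZ * GRAD_ACCUM = 16 * 2 = 32
# Effective auxiliary-task batch size = AUX_BSZ    * GRAD_ACCUM =  8 * 2 = 16
EXTRA_FLAGS="--classf_iter_batchsz ${CLASSF_BSZ} --per_gpu_train_batch_size ${AUX_BSZ} --gradient_accumulation_steps ${GRAD_ACCUM}"
# Append ${EXTRA_FLAGS} to the training command inside run_mt_multiple.sh / run_meta_multiple.sh.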
We used the TAPT baseline from the Don't Stop Pre-training paper. To reproduce this baseline, please follow the instructions in their repo here to download and run their pre-trained models.

To run our end-task aware training methods, use run_mt_multiple.sh for the multi-task variant (MT-TARTAN) and run_meta_multiple.sh for the meta-learned variant (META-TARTAN); an example invocation is sketched after the two commands below.
./run_mt_multiple.sh {task} {output_dir} {gpuid} {startseed} {endseed}
./run_meta_multiple.sh {task} {output_dir} {gpuid} {startseed} {endseed}
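For example, a hypothetical invocation (the task name, output directory, GPU id, and seed range below are placeholders):

# Hypothetical example: run META-TARTAN on a task named citation_intent with seeds 0 through 2 on GPU 0.
./run_meta_multiple.sh citation_intent outputs/citation_intent_meta 0 0 2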
Modify the following lines in the *.sh files to allow MLM auxiliary tasks based on multiple datasets (a hypothetical example follows):
--train_data_file [file1 file2 file3]
--aux-task-names [MLM1 MLM2 MLM3]
Note that the data used for the DAPT auxiliary tasks can be found in datasets/{task}/domain.NxTAPT.txt.
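As an illustration only, the modified lines might look like the following; the file paths and auxiliary-task names are hypothetical, so check the scripts for the exact argument syntax they expect.

# Hypothetical edit inside run_mt_multiple.sh / run_meta_multiple.sh:
# two MLM auxiliary tasks, one over task-specific data and one over domain data.
--train_data_file datasets/citation_intent/train.txt datasets/citation_intent/domain.NxTAPT.txt
--aux-task-names MLM-TAPT MLM-DAPT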
