This example trains a multi-layer RNN (Quasi-RNN, GRU, or LSTM) on a language modeling task. By default, the training script uses the provided PTB dataset. The trained model can then be used by the generate script to generate new text. This is a port of pytorch/examples/word_language_model that makes it usable on FloydHub.
The main.py script accepts the following arguments:
optional arguments:
-h, --help show this help message and exit
--data DATA location of the data corpus
--model MODEL type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)
--emsize EMSIZE size of word embeddings
--nhid NHID number of hidden units per layer
--nlayers NLAYERS number of layers
--lr LR initial learning rate
--optlr learning rate for optimizer
--clip CLIP gradient clipping
--epochs EPOCHS upper epoch limit
--batch-size N batch size
--adasoft activate adaptive softmax
--bptt BPTT sequence length
--pre pre-trained weight (200 or 300 emsize if using)
--dropout DROPOUT dropout applied to layers (0 = no dropout)
--decay DECAY learning rate decay per epoch
--tied tie the word embedding and softmax weights
--seed SEED random seed
--cuda use CUDA
--log-interval N report interval
--save SAVE path to save the final model
With these arguments, a variety of models can be tested. As an example, the following arguments produce slower but better models:
python main.py --cuda --emsize 300 --nhid 300 --dropout 0.2 --epochs 5 # Test perplexity of 98.73
These perplexities are equal to or better than those in Recurrent Neural Network Regularization (Zaremba et al. 2014) and are similar to those in Using the Output Embedding to Improve Language Models (Press & Wolf 2016) and Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling (Inan et al. 2016), though both of the latter papers improve their perplexities by using a form of recurrent dropout (variational dropout).
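For readers unfamiliar with the metric, perplexity is conventionally the exponential of the mean per-word negative log-likelihood (the cross-entropy loss, in nats). The sketch below illustrates that relationship with the standard library only. It assumes main.py reports perplexity in this conventional way; the helper names and the probabilities are hypothetical and are not taken from the script.

```python
import math


def mean_nll(word_probs):
    """Mean negative log-likelihood (in nats) of the probabilities
    a model assigned to the correct next words."""
    return -sum(math.log(p) for p in word_probs) / len(word_probs)


def perplexity(mean_nll_nats):
    """Perplexity is the exponential of the mean negative log-likelihood."""
    return math.exp(mean_nll_nats)


# Hypothetical example: the model gives probability 0.25 to each of
# four target words, so it is as uncertain as a uniform 4-way choice.
probs = [0.25, 0.25, 0.25, 0.25]
ppl = perplexity(mean_nll(probs))
print(f"perplexity: {ppl:.2f}")

# Going the other way: the loss that corresponds to the quoted
# test perplexity of 98.73.
print(f"loss for perplexity 98.73: {math.log(98.73):.3f} nats")
```

Under this definition, a test perplexity of 98.73 corresponds to a mean loss of roughly 4.59 nats per word, so lower loss on the held-out set translates directly into lower perplexity.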
Here are the commands for training, evaluating, and serving your language model on FloydHub.
Before you start, log in to FloydHub with the floyd login command, then fork and init the project:
$ git clone https://github.com/trexwithoutt/word-language-model.git
$ cd word-language-model
Download the GloVe vectors from https://github.com/3Top/word2vec-api
Some useful resources on deep learning for NLP and on the language modeling task: