|
28 | 28 | all of its inputs to be 3D tensors. The semantics of the axes of these |
29 | 29 | tensors are important. The first axis is the sequence itself, the second |
30 | 30 | indexes instances in the mini-batch, and the third indexes elements of |
31 | | -the input. We haven't discussed mini-batching, so lets just ignore that |
| 31 | +the input. We haven't discussed mini-batching, so let's just ignore that |
32 | 32 | and assume we will always have just 1 dimension on the second axis. If |
33 | 33 | we want to run the sequence model over the sentence "The cow jumped", |
34 | 34 | our input should look like |
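As a concrete sketch of the 3-D shape described above (the `embedding_dim` size here is an illustrative assumption, not a value fixed by the tutorial), an input for the 3-word sentence with a mini-batch of 1 could be built like this:

```python
import torch

# Hypothetical embedding size, chosen only for illustration.
embedding_dim = 6

# One vector per word of "The cow jumped":
# axis 0 -> the sequence (3 words),
# axis 1 -> instances in the mini-batch (kept at 1),
# axis 2 -> elements of the input (embedding_dim).
per_word = [torch.randn(1, embedding_dim) for _ in range(3)]
batched = torch.cat(per_word).view(3, 1, embedding_dim)
print(batched.shape)  # torch.Size([3, 1, 6])
```

The `view(3, 1, embedding_dim)` call is what adds the singleton second axis the LSTM expects when we ignore mini-batching.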
|
95 | 95 | # In this section, we will use an LSTM to get part of speech tags. We will |
96 | 96 | # not use Viterbi or Forward-Backward or anything like that, but as a |
97 | 97 | # (challenging) exercise for the reader, think about how Viterbi could be |
98 | | -# used after you have seen what is going on. |
| 98 | +# used after you have seen what is going on. In this example, we also refer |
| 99 | +# to embeddings. If you are unfamiliar with embeddings, you can read up |
| 100 | +# about them `here <https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html>`__. |
99 | 101 | # |
100 | 102 | # The model is as follows: let our input sentence be |
101 | 103 | # :math:`w_1, \dots, w_M`, where :math:`w_i \in V`, our vocab. Also, let |
@@ -127,16 +129,19 @@ def prepare_sequence(seq, to_ix): |
127 | 129 |
128 | 130 |
129 | 131 | training_data = [ |
| 132 | + # Tags are: DET - determiner; NN - noun; V - verb |
| 133 | + # For example, the word "The" is a determiner |
130 | 134 | ("The dog ate the apple".split(), ["DET", "NN", "V", "DET", "NN"]), |
131 | 135 | ("Everybody read that book".split(), ["NN", "V", "DET", "NN"]) |
132 | 136 | ] |
133 | 137 | word_to_ix = {} |
| 138 | +# For each words-list (sentence) and tags-list in each tuple of training_data |
134 | 139 | for sent, tags in training_data: |
135 | 140 | for word in sent: |
136 | | - if word not in word_to_ix: |
137 | | - word_to_ix[word] = len(word_to_ix) |
| 141 | + if word not in word_to_ix: # word has not been assigned an index yet |
| 142 | + word_to_ix[word] = len(word_to_ix) # Assign each word with a unique index |
138 | 143 | print(word_to_ix) |
139 | | -tag_to_ix = {"DET": 0, "NN": 1, "V": 2} |
| 144 | +tag_to_ix = {"DET": 0, "NN": 1, "V": 2} # Assign each tag with a unique index |
140 | 145 |
141 | 146 | # These will usually be more like 32 or 64 dimensional. |
142 | 147 | # We will keep them small, so we can see how the weights change as we train. |
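To see concretely what the indexing loop above produces, here is the same logic in plain Python (in the tutorial, `prepare_sequence` additionally wraps the looked-up indices in a `torch.tensor`; this sketch shows only the dictionary-building and lookup steps):

```python
training_data = [
    ("The dog ate the apple".split(), ["DET", "NN", "V", "DET", "NN"]),
    ("Everybody read that book".split(), ["NN", "V", "DET", "NN"]),
]

word_to_ix = {}
for sent, tags in training_data:
    for word in sent:
        if word not in word_to_ix:  # word has not been assigned an index yet
            word_to_ix[word] = len(word_to_ix)  # assign the next unique index

print(word_to_ix)
# {'The': 0, 'dog': 1, 'ate': 2, 'the': 3, 'apple': 4,
#  'Everybody': 5, 'read': 6, 'that': 7, 'book': 8}

# Index lookup for one sentence, as prepare_sequence does before
# converting the list to a tensor:
idxs = [word_to_ix[w] for w in "The dog ate the apple".split()]
print(idxs)  # [0, 1, 2, 3, 4]
```

Note that `"The"` and `"the"` get separate indices: the mapping is case-sensitive because `str.split` does not normalize case.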
|