Day 9 - Transformer Revision

The document discusses various concepts related to tokenization, encoding, and embedding in the context of machine learning and natural language processing. It highlights the importance of self-attention mechanisms and autoregressive models in decoding sequences. Additionally, it touches on the use of positional encoding and the significance of dimensionality in embeddings.

Uploaded by

cpusingpython

Running example sentence: "Keerti teaches at Educosys"


① Tokenization
   The sentence is split into tokens: "Keerti", "teaches", "at", "Educosys".

② Encoding
   Label encoding assigns each token an arbitrary integer; one-hot encoding assigns
   each token a vector with a single 1 (e.g. "Keerti" → [1, 0, 0, 0]).
   Neither captures meaning: every pair of one-hot vectors looks equally unrelated.
   Learned embeddings such as GloVe and word2vec do capture meaning.
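A minimal sketch of label and one-hot encoding for the example sentence (the helper name `one_hot` is mine, not from the notes):

```python
import numpy as np

# Tokens from the running example sentence.
tokens = ["Keerti", "teaches", "at", "Educosys"]

# Label encoding: each token gets an arbitrary integer id.
label = {tok: i for i, tok in enumerate(tokens)}

# One-hot encoding: a vector with a single 1 at the token's id.
def one_hot(tok):
    vec = np.zeros(len(tokens))
    vec[label[tok]] = 1.0
    return vec

print(one_hot("teaches"))                   # [0. 1. 0. 0.]
# Every pair of distinct one-hot vectors is orthogonal, so the encoding
# carries no notion of which words are similar:
print(one_hot("teaches") @ one_hot("at"))   # 0.0
```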

③ Embedding + Positional Encoding
   Each token is mapped to a dense embedding vector. Because the Transformer
   processes all tokens in parallel (not sequentially), word order is not implicit
   in the computation, so a positional encoding is added to each embedding:
   input = embedding + positional encoding.
Positional encoding formulas (pos = token position, i = index pair into the
vector, d_model = embedding dimension):

   PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
   PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))

Example: "Keerti" is at pos = 0; with embedding dimension d_model = 6, the
index i runs over 0, 1, 2.
- -

The positional-encoding vector has the same dimension as the embedding
(words dimension = embedding dimension = PE dimension = 6 here), so the two
are added element-wise, token by token.
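The formulas above can be sketched directly, using d_model = 6 as in the notes (the function name is mine):

```python
import numpy as np

def positional_encoding(n_tokens, d_model):
    """PE(pos, 2i)   = sin(pos / 10000**(2i/d_model))
       PE(pos, 2i+1) = cos(pos / 10000**(2i/d_model))"""
    pe = np.zeros((n_tokens, d_model))
    for pos in range(n_tokens):
        for i in range(0, d_model, 2):       # i steps over the even indices 2i
            angle = pos / 10000 ** (i / d_model)
            pe[pos, i] = np.sin(angle)
            pe[pos, i + 1] = np.cos(angle)
    return pe

pe = positional_encoding(n_tokens=4, d_model=6)
# pos = 0 gives sin(0) = 0 and cos(0) = 1 alternating:
print(pe[0])   # [0. 1. 0. 1. 0. 1.]
# pe has the same shape as the embedding matrix, so the model input is
# simply X = embeddings + pe.
```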

Self-Attention: Query, Key, Value
   Each token's input vector (embedding + positional encoding) is multiplied by
   three learned weight matrices Wq, Wk, Wv to produce its query, key, and value
   vectors:

      Q = X · Wq      K = X · Wk      V = X · Wv

   Stacking these row-wise gives the Query, Key, and Value matrices: one row per
   token, one column per head dimension.
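A sketch of the Q/K/V projections; random weights stand in for the learned Wq, Wk, Wv, and the dimensions follow the notes' running example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, d_k = 4, 6, 3    # 4 tokens, embedding dim 6, head dim 3

X = rng.normal(size=(n_tokens, d_model))   # embedding + positional encoding
Wq = rng.normal(size=(d_model, d_k))       # learned during training; random here
Wk = rng.normal(size=(d_model, d_k))
Wv = rng.normal(size=(d_model, d_k))

Q, K, V = X @ Wq, X @ Wk, X @ Wv
# One row per token, one column per head dimension.
print(Q.shape, K.shape, V.shape)   # (4, 3) (4, 3) (4, 3)
```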

Attention scores
   score = Q · K^T / sqrt(d_k)

   - Q · K^T compares every query against all the keys, giving a
     (no. of tokens × no. of tokens) matrix.
   - Dividing by sqrt(d_k), the dimension of the keys, keeps the scores
     from growing too large.
   - Softmax is then applied row-wise, so that each row's probabilities sum to 1.

Self-attention output

   Self-Attention Score = softmax(Q · K^T / sqrt(d_k)) · V
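Putting the pieces together, a sketch of softmax(Q·K^T/√d_k)·V; the function name is mine, and Q, K, V are random stand-ins with the shapes used above:

```python
import numpy as np

def self_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (n_tokens, n_tokens)
    # Row-wise softmax: each row becomes a probability distribution.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                        # weighted sum of the value vectors

rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(4, 3)) for _ in range(3))
out = self_attention(Q, K, V)
print(out.shape)   # (4, 3): one contextualised vector per token
```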

Multi-Head Attention
   - Several attention heads (e.g. 4) run in parallel, each with its own
     Wq, Wk, Wv.
   - The head outputs are concatenated and passed through a projection matrix
     to return to the model dimension.
   - Resulting shape: (batch size, no. of tokens, dimension).
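A shape-level sketch of multi-head attention with 4 heads as in the notes; I use d_model = 8 (rather than 6) so it splits evenly across the heads, and the projection matrix name Wo is mine:

```python
import numpy as np

rng = np.random.default_rng(2)
n_tokens, d_model, n_heads = 4, 8, 4
d_k = d_model // n_heads                    # per-head dimension: 2

X = rng.normal(size=(n_tokens, d_model))

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

heads = []
for _ in range(n_heads):                    # each head has its own Wq, Wk, Wv
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads.append(softmax(Q @ K.T / np.sqrt(d_k)) @ V)

concat = np.concatenate(heads, axis=-1)     # (4, 8): head outputs side by side
Wo = rng.normal(size=(d_model, d_model))    # final projection back to d_model
out = concat @ Wo
print(out.shape)   # (4, 8)
```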
Autoregressive Decoder
   - The decoder is autoregressive: each new token depends on the past values
     (the tokens generated so far).
   - Diff b/w Encoder and Decoder: the encoder attends to the whole input in
     parallel, while the decoder generates one token at a time.
   - LSTMs and RNNs are sequential by construction; the Transformer decoder
     enforces this ordering constraint through masking instead.

Masked Attention
   - While generating, the decoder must not see future tokens: after
     "Keerti teaches", the scores for the remaining tokens must be hidden.
   - Attention scores for future positions are set to -∞ (the mask), and the
     softmax is then applied:

        Masked Softmax(scores) = Softmax(attention scores + mask)

   - After the softmax, the masked (future) positions get probability 0;
     without masking, the decoder could attend to tokens it has not yet produced.
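The masking step can be sketched as: add −∞ to every future position before the softmax, so those positions come out with probability exactly 0 (the score values here are arbitrary stand-ins):

```python
import numpy as np

n_tokens = 4
scores = np.arange(16, dtype=float).reshape(n_tokens, n_tokens)  # stand-in scores

# Causal mask: -inf above the diagonal, i.e. at every future position.
mask = np.triu(np.full((n_tokens, n_tokens), -np.inf), k=1)
masked = scores + mask

# Row-wise softmax; exp(-inf) = 0, so future positions vanish.
e = np.exp(masked - masked.max(axis=-1, keepdims=True))
weights = e / e.sum(axis=-1, keepdims=True)

# Token 0 can only attend to itself; tokens after it get exactly 0.
print(weights[0])   # [1. 0. 0. 0.]
```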
