Chameli Devi Group of Institutions, Indore
Department of Computer Science and Engineering
Subject Notes
CS 601- Machine Learning
UNIT-IV
Syllabus: Recurrent neural network, Long short-term memory, Gated recurrent unit,
Translation, Beam search and width, Bleu score, Attention model, Reinforcement Learning, RL-
Framework, MDP, Bellman equations, Value Iteration and Policy Iteration, Actor-critic model, Q-
learning, SARSA
Course Outcome: Students will be able to apply Recurrent Neural Network principles to model
building for related real-world problems.
Recurrent neural network
A Recurrent Neural Network (RNN) is a type of neural network in which the output from the
previous step is fed as input to the current step. In traditional neural networks, all the inputs and
outputs are independent of each other, but in cases such as predicting the next word of a
sentence, the previous words are required, and hence there is a need to remember them. RNNs
solve this issue with the help of a hidden layer. The main and most important feature of an RNN
is the hidden state, which remembers some information about the sequence.
Figure 4.1: Recurrent neural network
RNNs have a "memory" which retains information about what has been calculated so far. An
RNN uses the same parameters at every time step, since it performs the same task on each input
(or hidden state) to produce the output. This reduces the number of parameters to learn, unlike
other neural networks.
The formula for the current state can be written as –
h_t = f(h_{t-1}, x_t)
Here, h_t is the new state, h_{t-1} is the previous state, and x_t is the current input. We now use
the state computed from the previous input rather than the previous input itself, because the
network has already applied its transformation to that input. Each successive input is called a
time step.
Taking the simplest form of a recurrent neural network, let's say that the activation function is
tanh, the weight at the recurrent neuron is W_hh and the weight at the input neuron is W_xh. We
can write the equation for the state at time t as:

h_t = tanh(W_hh h_{t-1} + W_xh x_t)
The recurrent neuron in this case takes only the immediately previous state into consideration.
For longer sequences the equation can involve multiple such states. Once the final state is
calculated, we can go on to produce the output.
Now, once the current state is calculated, we can calculate the output as:

y_t = W_hy h_t
Training through RNN
1. A single time step of the input, x_t, is supplied to the network.
2. We then calculate the current state h_t using a combination of the current input and the
previous state.
3. The current h_t becomes h_{t-1} for the next time step.
4. We can go through as many time steps as the problem demands, combining the information
from all the previous states.
5. Once all the time steps are completed, the final current state is used to calculate the output
y_t.
6. The output is then compared with the target output and the error is computed.
7. The error is then backpropagated through the network to update the weights, and thus the
network is trained (a minimal sketch of the forward pass follows).
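Below is a NumPy sketch of steps 1-5, the forward pass. The weight names follow the equations
above; the sizes and the random initialization are illustrative assumptions, not part of any
particular library.

import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, h0):
    # Run a simple RNN over a sequence of input vectors xs and collect the outputs.
    h = h0
    ys = []
    for x in xs:                              # one iteration per time step
        h = np.tanh(W_hh @ h + W_xh @ x)      # h_t = tanh(W_hh h_{t-1} + W_xh x_t)
        ys.append(W_hy @ h)                   # y_t = W_hy h_t
    return ys, h

# Toy usage: 3 time steps, 4-dimensional inputs, 5 hidden units, 2 outputs.
rng = np.random.default_rng(0)
xs = [rng.normal(size=4) for _ in range(3)]
W_xh = rng.normal(size=(5, 4))
W_hh = rng.normal(size=(5, 5))
W_hy = rng.normal(size=(2, 5))
outputs, final_state = rnn_forward(xs, W_xh, W_hh, W_hy, h0=np.zeros(5))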
Long short-term memory
Recurrent Neural Networks suffer from short-term memory. If a sequence is long enough, they
have a hard time carrying information from earlier time steps to later ones. So if you are trying
to process a paragraph of text to make predictions, RNNs may leave out important information
from the beginning.
Long Short-Term Memory (LSTM) is a kind of recurrent neural network that solves the above
problem. In an RNN, the output from the last step is fed as input to the current step. LSTM was
designed by Hochreiter & Schmidhuber. It tackles the problem of long-term dependencies in
RNNs, in which an RNN cannot predict a word stored in long-term memory but can give more
accurate predictions from recent information. An LSTM can by default retain information for
long periods of time. It is used for processing, predicting and classifying on the basis of time-
series data.
Structure of LSTM:
LSTM has a chain structure that contains four neural networks and different memory blocks
called cells.
Figure 4.2: LSTM Network
Information is retained by the cells and the memory manipulations are done by the gates.
There are three gates –
1. Forget gate: Information that is no longer useful in the cell state is removed by the forget
gate. Two inputs, x_t (the input at the current time step) and h_{t-1} (the previous cell output),
are fed to the gate and multiplied with weight matrices, followed by the addition of a bias. The
result is passed through a sigmoid activation function, which gives an output between 0 and 1.
If, for a particular element of the cell state, the output is close to 0, that piece of information is
forgotten; for an output close to 1, the information is retained for future use.
Figure 4.3: Forget gate
2. Input gate: The addition of useful information to the cell state is done by the input gate.
First, the information is regulated using a sigmoid function, which filters the values to be
remembered, similarly to the forget gate, using the inputs h_{t-1} and x_t. Then, a vector is
created using the tanh function, which gives an output from -1 to +1 and contains all the
candidate values from h_{t-1} and x_t. Finally, the values of the vector and the regulated
(sigmoid) values are multiplied to obtain the useful information to be added to the cell state.
Figure 4.4: Input gate
3. Output gate: The task of extracting useful information from the current cell state, to be
presented as the output, is done by the output gate. First, a vector is generated by applying the
tanh function to the cell state. Then, the information is regulated using a sigmoid function,
which filters the values to be remembered using the inputs h_{t-1} and x_t. Finally, the values
of the vector and the regulated values are multiplied and sent as the output, and also as input
to the next cell. (A code sketch of the three gate computations follows Figure 4.5.)
Figure 4.5: Output gate
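A minimal NumPy sketch of one LSTM cell step built from the three gates described above. The
per-gate parameter dictionaries W, U and b are an illustrative layout chosen for readability, not
the layout of any particular library.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])        # forget gate: what to drop from c_prev
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])        # input gate: what new info to write
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])        # output gate: what to expose as h_t
    c_tilde = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])  # candidate values in (-1, +1)
    c_t = f * c_prev + i * c_tilde                              # new cell state
    h_t = o * np.tanh(c_t)                                      # new hidden state (cell output)
    return h_t, c_t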
Gated recurrent unit
Gated Recurrent Units (GRU) are one of the popular variants of recurrent neural networks and
have been widely used in the context of machine translation. GRUs can also be regarded as a
simpler version of LSTMs (Long Short-Term Memory). The gated recurrent unit was introduced
so that each recurrent unit can adaptively capture dependencies of different time scales.
How does it work?
In a GRU, two gates are introduced: a reset gate that adjusts the incorporation of new input
with the previous memory, and an update gate that controls the preservation of the previous
memory. The reset gate and the update gate adaptively control how much each hidden unit
remembers or forgets while reading/generating a sequence.
Figure 4.6: Gated recurrent unit
In the above figure of the Gated Recurrent Unit, r and z are the reset and update gates, while h
and h˜ are the activation and the candidate activation respectively. The GRU works such that
when the reset gate is close to zero, the hidden state is forced to ignore the previous hidden
state and is reset with the current input.
This allows the hidden state to discard any information that will be irrelevant in the future,
which in turn allows a more compact representation. The update gate controls how much
information from the previous hidden state is carried over to the current hidden state. This
works in a similar manner to the memory cell in the Long Short-Term Memory network and
helps the RNN to remember long-term information.
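A minimal NumPy sketch of one GRU step, using one common gating convention; the parameter
layout and weight names are illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    r = sigmoid(W['r'] @ x_t + U['r'] @ h_prev + b['r'])              # reset gate
    z = sigmoid(W['z'] @ x_t + U['z'] @ h_prev + b['z'])              # update gate
    h_tilde = np.tanh(W['h'] @ x_t + U['h'] @ (r * h_prev) + b['h'])  # candidate activation
    h_t = (1 - z) * h_prev + z * h_tilde                              # mix old state and candidate
    return h_t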
Advantages of Gated Recurrent Unit
The Gated Recurrent Unit can be used to improve the memory capacity of a recurrent neural
network and also makes the model easier to train. Its gating mechanism also helps address the
vanishing gradient problem in recurrent neural networks. GRUs can be used in various
applications, including speech signal modeling, machine translation, and handwriting
recognition, among others.
Translation
Machine translation is the task of automatically converting source text in one language to text
in another language.
In a machine translation task, the input already consists of a sequence of symbols in some
language, and the computer program must convert this into a sequence of symbols in another
language. Given a sequence of text in a source language, there is no one single best translation
of that text to another language. This is because of the natural ambiguity and flexibility of
human language. The fact is that accurate translation requires background knowledge in order
to resolve ambiguity and establish the content of the sentence.
The main approaches to machine translation are:
Statistical Machine Translation-
Statistical machine translation, or SMT for short, is the use of statistical models that learn to
translate text from a source language to a target language. The approach is data-driven,
requiring only a corpus of examples with both source and target language text. This means
linguists are no longer required to specify the rules of translation.
Neural Machine Translation-
Neural machine translation, or NMT for short, is the use of neural network models to learn a
statistical model for machine translation.
The key benefit of the approach is that a single system can be trained directly on source and
target text, no longer requiring the pipeline of specialized systems used in statistical machine
translation.
Encoder-Decoder Model
Multilayer Perceptron neural network models can be used for machine translation, although
such models are limited by a fixed-length input sequence where the output must be of the same
length.
These early models have been greatly improved upon recently through the use of recurrent
neural networks organized into an encoder-decoder architecture that allows for variable-length
input and output sequences.
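A deliberately stripped-down NumPy sketch of the encoder-decoder idea: the encoder compresses
a variable-length input into one fixed-size context vector, and the decoder unrolls from that
vector for as many steps as needed. Real NMT decoders also feed back the previously generated
token through an embedding and a softmax; those parts are omitted here, and all weight names
and sizes are illustrative assumptions.

import numpy as np

def encode(xs, W_xh, W_hh):
    # Encoder RNN: fold a variable-length list of input vectors into one context vector.
    h = np.zeros(W_hh.shape[0])
    for x in xs:
        h = np.tanh(W_hh @ h + W_xh @ x)
    return h

def decode(context, W_hh, W_hy, n_steps):
    # Decoder RNN: unroll from the context and emit one output vector per step.
    h, outputs = context, []
    for _ in range(n_steps):
        h = np.tanh(W_hh @ h)        # evolve the decoder state
        outputs.append(W_hy @ h)     # project the state to an output vector
    return outputs

# Toy usage: encode a 4-step source sequence, then decode 6 output steps.
rng = np.random.default_rng(0)
ctx = encode([rng.normal(size=3) for _ in range(4)], rng.normal(size=(5, 3)), rng.normal(size=(5, 5)))
outs = decode(ctx, np.eye(5) * 0.9, rng.normal(size=(2, 5)), n_steps=6)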
Beam search and width
Another popular heuristic is the beam search that expands upon the greedy search and returns
a list of most likely output sequences. Instead of greedily choosing the most likely next step as
the sequence is constructed, the beam search expands all possible next steps and keeps the k
most likely, where k is a user-specified parameter and controls the number of beams or parallel
searches through the sequence of probabilities.
The local beam search algorithm keeps track of k states rather than just one. It begins with k
randomly generated states. At each step, all the successors of all k states are generated. If one
of them is a goal, the algorithm halts. Otherwise, it selects the k best successors from the
complete list and repeats. We do not need to start with random states; instead, we start with
the k most likely words as the first step in the sequence.
Common beam width values are 1 for a greedy search and values of 5 or 10 for common
benchmark problems in machine translation. Larger beam widths result in better performance
of a model, as the multiple candidate sequences increase the likelihood of better matching a
target sequence, but this increased performance comes at the cost of a decrease in decoding
speed.
The search process can halt for each candidate separately, either by reaching a maximum
length, by reaching an end-of-sequence token, or by reaching a threshold likelihood.
Example:
We can define a function to perform the beam search for a given sequence of probabilities and
beam width parameter k. At each step, each candidate sequence is expanded with all possible
next steps. Each candidate step is scored by multiplying the probabilities together. The k
sequences with the most likely probabilities are selected and all other candidates are pruned.
The process then repeats until the end of the sequence.
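A sketch of such a function in plain Python. Scores are accumulated as sums of log-probabilities,
which is equivalent to multiplying the probabilities but numerically safer; the toy probability
table is made up for illustration.

from math import log

def beam_search(prob_rows, k):
    # prob_rows: one list of token probabilities per decoding step.
    beams = [([], 0.0)]                            # (token sequence, cumulative log-probability)
    for row in prob_rows:
        candidates = []
        for seq, score in beams:
            for token, p in enumerate(row):        # expand every beam with every next token
                candidates.append((seq + [token], score + log(p)))
        # keep only the k best-scoring candidates, prune the rest
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams

# Toy usage: 3 decoding steps over a 4-word vocabulary, beam width 2.
probs = [[0.1, 0.5, 0.3, 0.1],
         [0.4, 0.2, 0.2, 0.2],
         [0.1, 0.1, 0.7, 0.1]]
print(beam_search(probs, k=2))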
Bleu score: The bilingual evaluation understudy (Bleu) score quantifies how good a machine
translation is by computing a similarity score based on n-gram precision. It is a measurement of
the differences between an automatic translation and one or more human-created reference
translations of the same source sentence.
It is defined as follows:

p_n = ( Σ_{n-gram ∈ candidate} count_clip(n-gram) ) / ( Σ_{n-gram ∈ candidate} count(n-gram) )

where p_n is the clipped (modified) n-gram precision used in the BLEU score.
The BLEU algorithm compares consecutive phrases of the automatic translation with the
consecutive phrases it finds in the reference translation and counts the number of matches in a
weighted fashion. These matches are position independent. A higher number of matches
indicates a higher degree of similarity with the reference translation, and hence a higher score.
Intelligibility and grammatical correctness are not taken into account.
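A plain-Python sketch of the clipped n-gram precision p_n defined above. The full BLEU score
additionally combines p_1 through p_4 with a geometric mean and a brevity penalty, which are
omitted here; the example sentences are made up.

from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    # Clipped precision: each candidate n-gram counts at most as often as in the best reference.
    cand_counts = Counter(ngrams(candidate, n))
    max_ref_counts = Counter()
    for ref in references:
        for gram, c in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], c)
    clipped = sum(min(c, max_ref_counts[gram]) for gram, c in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

# Toy usage: unigram precision of a candidate against one reference translation.
cand = "the cat is on the mat".split()
refs = ["the cat sat on the mat".split()]
print(modified_precision(cand, refs, n=1))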
Attention model
Attention is proposed as a solution to the limitation of the encoder-decoder model, which
encodes the input sequence into one fixed-length vector from which each output time step is
decoded. This issue is believed to be more of a problem when decoding long sequences.
Figure 4.7: Attention Model
This model allows an RNN to pay attention to the specific parts of the input that are considered
important, which improves the performance of the resulting model in practice. Denoting by
α<t,t′> the amount of attention that the output y<t> should pay to the activation a<t′>, and by
c<t> the context at time t, we have:
c<t> = Σ_{t′} α<t,t′> a<t′>,   with   Σ_{t′} α<t,t′> = 1
Attention weight: the amount of attention that the output y<t> should pay to the activation
a<t′> is given by α<t,t′>, computed as a softmax over alignment scores e<t,t′>:

α<t,t′> = exp(e<t,t′>) / Σ_{t″=1}^{T_x} exp(e<t,t″>)
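A small NumPy sketch of these two formulas. The alignment scores e<t,t′> are assumed to be
given (in practice they come from a small learned network), and the encoder activations a<t′>
are toy vectors chosen for illustration.

import numpy as np

def attention_context(scores, activations):
    # Softmax the alignment scores into weights alpha<t,t'>, then average the activations.
    e = np.asarray(scores, dtype=float)
    alpha = np.exp(e - e.max())                    # subtract the max for numerical stability
    alpha = alpha / alpha.sum()                    # the weights sum to 1 over the input positions
    context = sum(a * h for a, h in zip(alpha, activations))   # c<t> = sum_t' alpha<t,t'> a<t'>
    return alpha, context

# Toy usage: 3 encoder activations of dimension 4.
acts = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]
weights, c_t = attention_context([0.1, 0.5, 2.0], acts)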
Reinforcement Learning
Reinforcement learning is a branch of machine learning concerned with taking a sequence of
actions in order to maximize some reward.
Basically, an RL agent does not know anything about the environment; it learns what to do by
exploring the environment. It takes actions, and receives states and rewards. The agent can only
change its environment through its actions.
Concepts:
Agents take actions in an environment and receive states and rewards
The goal is to find a policy that maximizes the agent's utility function
Inspired by research on psychology and animal learning
Figure 4.8: Reinforcement Learning
Here we do not know which actions will produce rewards, nor when an action will produce a
reward; sometimes an action takes time before it produces a reward. Basically, everything is
learned through interaction with the environment.
Reinforcement learning components:
Agent: Our robot
Environment: The game, or where the agent lives
A set of states
Policy: A map from states to actions
Reward Function: Gives the immediate reward for each state
Value Function: Gives the total amount of reward the agent can expect, starting from a
particular state, over the states reachable from it. With the value function you can find a policy.
Model (Optional): Used to do planning, instead of the simple trial-and-error approach
common to reinforcement learning. It describes the possible next state after an action is taken
in a given state. (A minimal sketch of the interaction loop between these components follows.)
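Below is a plain-Python sketch of that loop, using a made-up toy environment with a Gym-like
reset()/step() interface; all names and numbers are illustrative.

import random

class LineWorld:
    # Toy environment: walk on positions 0..4; reaching 4 gives reward 1 and ends the episode.
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):                        # action: -1 (left) or +1 (right)
        self.pos = max(0, min(4, self.pos + action))
        done = self.pos == 4
        return self.pos, (1.0 if done else 0.0), done

def run_episode(env, policy, max_steps=100):
    # Agent-environment loop: act, observe the next state and reward, accumulate the return.
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                     # the agent's policy maps states to actions
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

print(run_episode(LineWorld(), policy=lambda s: random.choice([-1, 1])))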
RL-Framework
The following are five widely used RL frameworks:
1. Acme
About: Acme is a framework for distributed reinforcement learning introduced by DeepMind.
The framework is used to build readable, efficient, research-oriented RL algorithms. At its core,
Acme is designed to enable simple descriptions of RL agents that can be run at various scales of
execution, including distributed agents. This framework aims to make the results of various RL
algorithms developed in academia and industrial labs easier to reproduce and extend for the
machine learning community at large.
2. DeeR
About: DeeR is a Python library for deep reinforcement learning. The framework is built with
modularity in mind so that it can easily be adapted to any need and provides many possibilities
such as Double Q-learning, Prioritized Experience Replay, Deep deterministic Policy Gradient
(DDPG), and Combined Reinforcement via Abstract Representations (CRAR).
3. Dopamine
About: Dopamine is a popular research framework for fast prototyping of reinforcement
learning algorithms. The framework aims to fill the need for a small, easily understood codebase
in which users can freely experiment with research ideas. The design principles of this
framework include flexible development, reproducibility, easy experimentation and more.
4. Frap
About: Frap, or Framework for Reinforcement learning And Planning, is a unifying framework
that identifies the underlying dimensions on which any planning or learning algorithm has to
decide. The framework provides deeper insight into the algorithmic space of planning and
reinforcement learning and also suggests new approaches to integrate the two fields. The aim of
this framework is to provide a common language to categorize algorithms, as well as to identify
new research directions.
5. RLgraph
About: RLgraph is a reinforcement learning framework that quickly prototypes, defines and
executes reinforcement learning algorithms both in research and practice. The framework
supports TensorFlow (or static graphs in general) or eager/define-by-run execution (PyTorch)
through a single component interface. Using RLgraph, developers can combine high-level
components in a space-independent manner and define input spaces.
MDP (Markov decision process)
MDP is a framework that can solve most Reinforcement Learning problems with discrete
actions. With the Markov Decision Process, an agent can arrive at an optimal policy for
maximum rewards over time.
The Markov decision process, better known as MDP, is an approach in reinforcement learning to
take decisions in a grid world environment. A grid world environment consists of states in the
form of grids.
The aim of MDP is to train an agent to find a policy that will return the maximum cumulative
rewards from taking a series of actions in one or more states.
Figure 4.9: Markov Decision Process
Here are the most important parts:
States: A set of possible states
Model: The probability of moving to state s' when you take action a while in state s. It is
also called the transition model.
Action: The things that you can do in a particular state
Reward: A scalar value that you get for being in a state
Policy: A map that tells the agent the action to take in every state
Optimal policy: A policy that maximizes the expected reward (a small example written out
in code follows this list)
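The sketch below writes these pieces out for a hypothetical two-state MDP in plain Python; every
value is made up purely to make the notation concrete.

# States, actions, transition model P, reward function R and discount factor gamma.
states = ['A', 'B']
actions = ['stay', 'move']
# P[s][a] maps each possible next state to its probability (the transition model).
P = {
    'A': {'stay': {'A': 1.0}, 'move': {'A': 0.2, 'B': 0.8}},
    'B': {'stay': {'B': 1.0}, 'move': {'A': 0.8, 'B': 0.2}},
}
R = {'A': 0.0, 'B': 1.0}              # scalar reward for being in each state
gamma = 0.9                           # discount factor for future rewards
policy = {'A': 'move', 'B': 'stay'}   # one possible (not necessarily optimal) policy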
Bellman equations
The agent tries to get the largest expected sum of rewards from every state it lands in. In order
to achieve that, we must try to get the optimal value function, i.e. the maximum sum of
cumulative rewards. The Bellman equation helps us do so.
Using the Bellman equation, the value function can be decomposed into two parts: an immediate
reward, R_{t+1}, and the discounted value of the successor state, γV(S_{t+1}).
v(s) = 𝔼 [ G_t | S_t = s ]
We unroll the return G_t,
= 𝔼 [ R_{t+1} + γR_{t+2} + γ²R_{t+3} + … | S_t = s ]
= 𝔼 [ R_{t+1} + γ(R_{t+2} + γR_{t+3} + …) | S_t = s ]
then substitute the return G_{t+1}, starting from time step t+1,
= 𝔼 [ R_{t+1} + γG_{t+1} | S_t = s ]
Finally, since expectation is a linear operator, meaning that 𝔼(aX + bY) = a𝔼(X) + b𝔼(Y), the
expected value of the return G_{t+1} is the value of the state S_{t+1},
= 𝔼 [ R_{t+1} + γv(S_{t+1}) | S_t = s ]
That gives us the Bellman equation for MRPs,
v(s) = 𝔼 [ R_{t+1} + γv(S_{t+1}) | S_t = s ]
So, for each state in the state space, the Bellman equation gives us the value of that state,
v(s) = R_s + γ Σ_{s′∈S} P_{ss′} v(s′)
The value of the state S is the reward we get upon leaving that state, plus a discounted average
over next possible successor states, where the value of each possible successor state is
multiplied by the probability that we land in it.
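A small plain-Python sketch that iterates this equation to its fixed point for a hypothetical
two-state example; P[s] holds the transition probabilities under the current policy, and all
numbers are made up.

R = {'A': 0.0, 'B': 1.0}
P = {'A': {'A': 0.2, 'B': 0.8}, 'B': {'A': 0.8, 'B': 0.2}}
gamma = 0.9
v = {s: 0.0 for s in R}

for _ in range(200):   # repeated Bellman backups converge to the true values
    v = {s: R[s] + gamma * sum(p * v[s2] for s2, p in P[s].items()) for s in R}

print(v)               # fixed point of v(s) = R_s + gamma * sum_s' P_ss' v(s')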
Value Iteration and Policy Iteration
The value-iteration and policy-iteration algorithms are two fundamental methods for solving
MDPs. Both value-iteration and policy-iteration assume that the agent knows the MDP model of
the world (i.e. the agent knows the state-transition and reward probability functions).
Therefore, they can be used by the agent to (offline) plan its actions given knowledge about the
environment before interacting with it.
Value iteration computes the optimal state value function by iteratively improving the estimate
of V(s). The algorithm initializes V(s) to arbitrary random values and repeatedly updates the
Q(s, a) and V(s) values until they converge. Value iteration is guaranteed to converge to the
optimal values.
The value-iteration algorithm keeps improving the value function at each iteration until the
value function converges. Since the agent only cares about finding the optimal policy, the
optimal policy sometimes converges before the value function does. Therefore another
algorithm, called policy iteration, instead of repeatedly improving the value-function estimate,
re-defines the policy at each step and computes the value function according to this new policy,
until the policy converges.
Policy iteration is also guaranteed to converge to the optimal policy, and it often takes fewer
iterations to converge than the value-iteration algorithm. Both value-iteration and policy-
iteration algorithms can be used for offline planning, where the agent is assumed to have prior
knowledge about the effects of its actions on the environment (they assume the MDP model is
known). Compared with value iteration, policy iteration is computationally efficient in the sense
that it often takes considerably fewer iterations to converge, although each iteration is more
computationally expensive.
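A plain-Python sketch of value iteration on the hypothetical two-state MDP from the MDP
section; the greedy policy is read off from the converged values at the end. All numbers are
illustrative.

P = {
    'A': {'stay': {'A': 1.0}, 'move': {'A': 0.2, 'B': 0.8}},
    'B': {'stay': {'B': 1.0}, 'move': {'A': 0.8, 'B': 0.2}},
}
R = {'A': 0.0, 'B': 1.0}
gamma, theta = 0.9, 1e-6
V = {s: 0.0 for s in P}

def q_value(s, a):
    # Q(s, a) = R_s + gamma * expected value of the next state under action a.
    return R[s] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())

while True:
    delta = 0.0
    for s in P:
        new_v = max(q_value(s, a) for a in P[s])   # V(s) <- max_a Q(s, a)
        delta = max(delta, abs(new_v - V[s]))
        V[s] = new_v
    if delta < theta:                              # stop once the values have stopped changing
        break

policy = {s: max(P[s], key=lambda a: q_value(s, a)) for s in P}
print(V, policy)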
Actor-critic model
1. The “Critic” estimates the value function. This could be the action-value (the Q value) or
state-value (the V value).
2. The “Actor” updates the policy distribution in the direction suggested by the Critic (such as
with policy gradients).
Actor-critic aims to take advantage of the strengths of both value-based and policy-based
methods while eliminating their drawbacks.
The principal idea is to split the model in two: one part computes an action based on a state,
and the other produces the Q values of (i.e. evaluates) that action.
How Actor-critic works
Imagine you play a video game with a friend who provides you with some feedback. You are the
Actor and your friend is the Critic.
Figure 4.10: Working of Actor-critic
At the beginning, you do not know how to play, so you try some actions randomly. The Critic
observes your actions and provides feedback.
Learning from this feedback, you update your policy and become better at playing the game.
On the other hand, your friend (the Critic) also updates their own way of providing feedback so
it can be better next time.
The idea of Actor-Critic is to have two neural networks. We estimate both:
a) ACTOR: a policy function, which controls how our agent acts.
b) CRITIC: a value function, which measures how good these actions are.
Both run in parallel. Because we have two models (Actor and Critic) that must be trained, we
have two sets of weights that must be optimized separately.
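A minimal NumPy sketch of one actor-critic update for tabular states with a softmax actor. It
uses the TD error as the critic's feedback, which is one common choice; the learning rates and
the parameter layout are illustrative assumptions, not a specific published implementation.

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def actor_critic_update(theta, V, s, a, r, s_next, done,
                        alpha_actor=0.1, alpha_critic=0.1, gamma=0.99):
    # theta[s] holds the actor's action preferences in state s; V[s] is the critic's value estimate.
    target = r + (0.0 if done else gamma * V[s_next])
    td_error = target - V[s]                       # the critic's feedback on the transition
    V[s] += alpha_critic * td_error                # critic: move V(s) toward the TD target
    probs = softmax(theta[s])
    grad_log = -probs
    grad_log[a] += 1.0                             # gradient of log pi(a|s) for a softmax policy
    theta[s] += alpha_actor * td_error * grad_log  # actor: follow the critic's suggestion
    return td_error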
Q-learning
Consider the case where the agent does not know a priori what the effects of its actions on the
environment are, i.e. the state-transition and reward models are not known. The agent only
knows the set of possible states and actions, and can observe the current state of the
environment. In this case, the agent has to learn actively through the experience of interacting
with the environment. There are two categories of learning algorithms:
Model-based learning: In model-based learning, the agent interacts with the environment and,
from the history of its interactions, tries to approximate the environment's state-transition and
reward models. Afterwards, given the models it has learnt, the agent can use value iteration or
policy iteration to find an optimal policy.
Model-free learning: In model-free learning, the agent does not try to learn explicit models of
the environment's state-transition and reward functions. Instead, it directly derives an optimal
policy from its interactions with the environment.
Q-Learning is an example of a model-free learning algorithm. It does not assume that the agent
knows anything about the state-transition and reward models. Instead, the agent discovers the
good and bad actions by trial and error.
The basic idea of Q-Learning is to approximate the state-action value function Q from the
samples of Q(s, a) that we observe while interacting with the environment. This approach is
known as Temporal-Difference (TD) Learning.
The Q-learning algorithm Process
Figure 4.11: Q-learning Algorithm Process
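A small plain-Python sketch of the two core pieces of tabular Q-learning: an ε-greedy action
choice and the off-policy update that bootstraps from the best action in the next state. Q is
assumed to be a dictionary keyed by (state, action) pairs; all names and constants are
illustrative.

import random
from collections import defaultdict

def epsilon_greedy(Q, s, actions, eps=0.1):
    # Explore with probability eps, otherwise act greedily with respect to Q.
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    # Off-policy TD update: the target uses the max over next actions,
    # regardless of which action the agent actually takes next.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Toy usage on a single observed transition.
Q = defaultdict(float)
a = epsilon_greedy(Q, s=0, actions=['left', 'right'])
q_learning_update(Q, s=0, a=a, r=1.0, s_next=1, actions=['left', 'right'])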
SARSA
SARSA stands for State-Action-Reward-State-Action, which symbolizes the tuple (s, a, r, s', a');
it is an on-policy algorithm for TD learning. The major difference between it and Q-Learning is
that the maximum reward for the next state is not necessarily used for updating the Q-values.
Instead, a new action, and therefore reward, is selected using the same policy that determined
the original action. The name SARSA comes from the fact that the updates are done using the
quintuple (s, a, r, s', a'), where s, a are the original state and action, r is the reward observed in
the following state, and s', a' are the new state-action pair.
The procedural form of SARSA algorithm is as follows:
Initialize Q(s, a) arbitrarily
Repeat (for each episode):
    Initialize s
    Choose a from s using the policy derived from Q (e.g., ε-greedy)
    Repeat (for each step of the episode):
        Take action a, observe r, s'
        Choose a' from s' using the policy derived from Q (e.g., ε-greedy)
        Q(s, a) <-- Q(s, a) + α [r + γ Q(s', a') – Q(s, a)]
        s <-- s', a <-- a'
    Until s is terminal
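A plain-Python translation of the procedure above, assuming the same toy reset()/step()
environment interface used in the Reinforcement Learning section; the learning rates and other
constants are illustrative.

import random
from collections import defaultdict

def sarsa(env, actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    Q = defaultdict(float)                         # Q(s, a), initialised arbitrarily (here: 0)

    def choose(s):                                 # epsilon-greedy policy derived from Q
        if random.random() < eps:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s = env.reset()
        a = choose(s)
        done = False
        while not done:
            s2, r, done = env.step(a)              # take action a, observe r, s'
            a2 = choose(s2)                        # choose a' from s' with the same policy
            target = r + (0.0 if done else gamma * Q[(s2, a2)])
            Q[(s, a)] += alpha * (target - Q[(s, a)])   # on-policy update with the chosen a'
            s, a = s2, a2                          # episode ends when s is terminal
    return Q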