Deep Learning
Introduction
• Deep learning is a subset of machine learning.
Difference between DL and ML
• https://www.youtube.com/watch?v=6M5VXKLf4D4
Applications of DL
• Virtual Assistants: Siri, Alexa, Cortana etc.
• Recommendation Engines
• News aggregator and Fake news detector
• Music Composition
• Image Captioning
• Self Driving Cars
• Language Translators
Types of DL Models
• Supervised Models
• CNN (Convolutional Neural Network)
• Image Classification
• RNN (Recurrent Neural Network)
• To Predict Sequences
• LSTM is one of the most popular techniques.
Disadvantages of Feed-Forward NNs
• Cannot handle sequential data.
• Considers only the current input.
• Cannot memorize previous inputs.
• Solution: RNN.
RNN
• Designed to work with time-series data, or data that involves
sequences.
• Sequence: One data point is dependent upon the previous data point.
• Concept of Memory involved.
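The recurrence behind this "memory" can be sketched in a few lines of NumPy. The weight names, sizes, and `rnn_step` function below are illustrative, not from any particular library; the point is that the new hidden state depends on both the current input and the previous hidden state, which a feed-forward net lacks.

```python
import numpy as np

# One RNN step: the hidden state h carries information forward in time.
def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(42)
input_dim, hidden_dim = 3, 4
W_xh = rng.standard_normal((hidden_dim, input_dim)) * 0.1   # input weights
W_hh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1  # recurrent weights
b_h = np.zeros(hidden_dim)

# Unroll over 5 time steps, reusing the SAME weights at every step.
h = np.zeros(hidden_dim)
for t in range(5):
    x_t = rng.standard_normal(input_dim)
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)

print(h.shape)  # (4,)
```

Because `h` is fed back in at every step, each output depends on the whole history of inputs, which is exactly what the sequence applications below need.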
Applications of RNN
• Image Captioning
• Text Prediction
• Machine Translation
• Time Series Prediction
Types of RNN
• One to One RNN (Vanilla)
• One to Many RNN (e.g., Image Captioning)
• Many to One RNN (e.g., Sentiment Analysis)
• Many to Many RNN (e.g., Machine Translation)
Problems with RNN
• Vanishing Gradient
• Occurs when the activation function is sigmoid.
• The derivative of the sigmoid function lies between 0 and 0.25.
• Multiplying many such derivatives during backpropagation makes the gradient smaller and smaller.
• Over a period of time, the old and new weights become (nearly) the same, so learning stalls.
• Exploding Gradient
• When the gradient becomes very large and the model is not able to converge.
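Both problems can be sketched numerically. The snippet below is illustrative: it shows the best-case sigmoid derivative (0.25) shrinking the gradient over 20 time steps, and gradient clipping, a common remedy for the exploding case (clipping is a standard technique, though not mentioned above).

```python
import numpy as np

# Vanishing: backprop through time multiplies one sigmoid-derivative
# factor per step, and that factor is at most 0.25.
def sigmoid_derivative(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

grad_prod = 1.0
for _ in range(20):                       # 20 time steps
    grad_prod *= sigmoid_derivative(0.0)  # 0.25 even in the best case (z = 0)
print(grad_prod)                          # 0.25**20, effectively zero

# Exploding is the mirror image: repeated factors > 1. Gradient clipping
# rescales the gradient whenever its norm exceeds a threshold.
grad = np.array([3.0, -4.0])              # gradient with norm 5
max_norm = 1.0
norm = np.linalg.norm(grad)
if norm > max_norm:
    grad = grad * (max_norm / norm)       # clipped to norm 1
```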
Way to deal with Gradient Problem
• LSTM (Long Short-Term Memory)
• LSTMs are capable of learning long-term
dependencies by remembering
information for long periods of time.
LSTM Basics
• NLP task: sentence auto-completion
• "Today, due to a health issue, I …" should complete as "need medication".
• "Last week, due to a health issue, I …" should complete as "had taken medication".
• The tense depends on words far back in the sentence, so the model needs to remember long history…
• But a plain RNN has only short-term memory.
Another Example
• Maya loves eating samosas every day. Her favorite cuisine is ……….
Memory Cell
Building Long Term Memory
Only Keywords are stored.
Long Term Memory
• Maya loves samosa, her favorite cuisine is Indian. Rahul loves pizza
and pasta, his favorite cuisine is …….
• Forget some things and remember others.
• This is the forget gate.
• Input Gate
• Adds new memory, e.g. "pasta".
Output Gate
LSTM Architecture
• Consists of 3 parts, called gates:
1. Forget Gate
2. Input Gate
3. Output Gate
3 Step Process of LSTM
• Decide how much past data should be
remembered (FORGET GATE)
• Let the output h(t-1) be "Rita is good at Maths.
Jimmy, on the other hand, is good at Biology."
• Let the input at x(t) be "Jimmy called me yesterday. He
loves to play football. He is selected as the captain
of his team."
• The forget gate realizes there might be a change in
context after encountering the first full stop.
• It compares this with the current input at
x(t). The next sentence talks about Jimmy, so the
information on Rita is deleted.
• The position of the subject is vacated and assigned
to Jimmy.
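The forget gate itself is a small computation. The sketch below uses illustrative weight names (W_f, b_f) and sizes: a sigmoid over the previous hidden state and current input produces a vector of values in (0, 1), and multiplying the old cell state by it keeps or erases each remembered piece of information.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
hidden_dim, input_dim = 4, 3
W_f = rng.standard_normal((hidden_dim, hidden_dim + input_dim))
b_f = np.zeros(hidden_dim)

h_prev = rng.standard_normal(hidden_dim)  # previous hidden state h(t-1)
x_t = rng.standard_normal(input_dim)      # current input x(t)
C_prev = rng.standard_normal(hidden_dim)  # previous cell state (the "Rita" info)

# f_t = sigmoid(W_f . [h_prev, x_t] + b_f); every entry lies in (0, 1):
# near 1 means "keep this", near 0 means "forget this".
f_t = sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)
C_after_forget = f_t * C_prev  # positions gated toward 0 are dropped
```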
Step 2
• What information do we need to store in the
cell state?
• First, a sigmoid layer (the "input gate layer")
decides which values will be updated.
• Next, a tanh layer creates a vector of new
candidate values that could be added to
the state.
• In the next step, we combine these two to
create an update to the state.
• With the current input at x(t), the input gate analyzes
the important information in "Jimmy called me yesterday. He
loves to play football. He is selected as the captain of his team."
• The information "He loves to play football. He is
selected as the captain of his team" is remembered.
• "He called me yesterday" is less important; hence it is
forgotten. This process of adding new information
is done via the input gate.
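Step 2 can be sketched the same way (again with illustrative weight names and sizes): a sigmoid input gate i_t picks which positions to update, a tanh layer proposes candidate values, and their product is added to the cell state left over from the forget gate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
hidden_dim, input_dim = 4, 3
concat = rng.standard_normal(hidden_dim + input_dim)  # [h_prev, x_t]

W_i = rng.standard_normal((hidden_dim, hidden_dim + input_dim))
W_c = rng.standard_normal((hidden_dim, hidden_dim + input_dim))

i_t = sigmoid(W_i @ concat)      # which values to update, each in (0, 1)
C_tilde = np.tanh(W_c @ concat)  # candidate values, each in (-1, 1)

# Combine with the state produced by the forget gate: the gated new
# information is ADDED, so memories accumulate rather than overwrite.
C_after_forget = rng.standard_normal(hidden_dim)
C_t = C_after_forget + i_t * C_tilde
```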
Step 3
• Decide What Part of the Current Cell State Makes It to
the Output
• First, we run a sigmoid layer, which decides what parts
of the cell state make it to the output.
• Then, we put the cell state through tanh to push the
values to be between -1 and 1 and multiply it by the
output of the sigmoid gate.
• E.g.: "Jimmy played well against the opponent. Brave
_____ was awarded player of the match."
• For the empty place there are many choices.
• "Brave" is an adjective, so the blank calls for a noun.
• Given the current cell state, the best choice is "Jimmy".
• https://colah.github.io/posts/2015-08-Understanding-LSTMs/
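Step 3 completes the cell. In the sketch below (illustrative weights, same shapes as the earlier snippets), a sigmoid layer o_t chooses which parts of the cell state to expose, tanh squashes the state into (-1, 1), and their product is the new hidden state h_t.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
hidden_dim, input_dim = 4, 3
concat = rng.standard_normal(hidden_dim + input_dim)  # [h_prev, x_t]

W_o = rng.standard_normal((hidden_dim, hidden_dim + input_dim))
o_t = sigmoid(W_o @ concat)  # which parts of the state make it to the output

C_t = rng.standard_normal(hidden_dim)  # cell state from steps 1 and 2
h_t = o_t * np.tanh(C_t)     # output: gated, squashed cell state
```

h_t is both the cell's output at this step and the hidden state carried into the next step, which is how the "Jimmy" context reaches the blank.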
Deep Learning Frameworks
• TensorFlow
• Keras
• PyTorch
• Theano
• Caffe
• Microsoft CNTK
TensorFlow
• Google’s Brain team developed the TensorFlow Deep Learning Framework.
• Primary API in Python; interfaces also exist for other languages (e.g., R via RStudio’s package).
• Uses dataflow graphs to process data.
• Makes it easy to build robust models.
• TensorBoard is available for visualization.
Keras
• François Chollet developed Keras, and it is now one of the fastest
growing Deep Learning frameworks.
• A high-level neural network API, written in Python.
• Can run on top of TensorFlow, Theano, and CNTK.
• User-friendly, as the API is simple.
• Extensible, as new modules are simple to add.
• TensorFlow has adopted Keras as its official high-level API.
• Used by companies like Netflix, Uber, etc.
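As a taste of that API, here is a minimal sketch of a many-to-one LSTM model in Keras. The layer sizes and the time-series framing are arbitrary choices for illustration, not a prescribed architecture.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Sequences of 10 time steps with 1 feature each go in; one value comes out.
model = keras.Sequential([
    layers.Input(shape=(10, 1)),  # (time steps, features)
    layers.LSTM(32),              # 32 LSTM units; returns the last hidden state
    layers.Dense(1),              # e.g. the next value in a time series
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(4, 10, 1)      # a batch of 4 dummy sequences
y = model.predict(x, verbose=0)
print(y.shape)  # (4, 1)
```

Swapping the `LSTM` layer for `SimpleRNN` gives the vanilla RNN discussed earlier; the rest of the model is unchanged, which is the kind of simplicity the bullet points above refer to.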