Module 7 - Deep Sequence Modeling

Module 7 – Part I

DNN vs RNN for Timeseries


Intuition

(Figure: standard NN vs. recurrent NN)

Dr. Pedram Jahangiry


Road map!
• Module 1- Introduction to Deep Forecasting
• Module 2- Setting up Deep Forecasting Environment
• Module 3- Exponential Smoothing
• Module 4- ARIMA models
• Module 5- Machine Learning for Time series Forecasting
• Module 6- Deep Neural Networks
• Module 7- Deep Sequence Modeling (RNN, LSTM)
• Module 8- Prophet and Neural Prophet

Dr. Pedram Jahangiry


What is Sequence Data?
• Sequence data refers to any data that has a specific order or sequence to it!

Sequence Data
• Time Series Data
  • Regular TS
  • Irregular TS
• Text Data
• Audio Data
• Video Data

Dr. Pedram Jahangiry


Time series Tasks
TS tasks
• Forecasting
  • Qualitative
  • Quantitative
• Classification
• Clustering
• Anomaly/Event Detection

Dr. Pedram Jahangiry


Understanding DNNs & RNNs for Time Series Forecasting

• Comparing feed-forward networks with sequential models


• Key ideas: data transformation, memory

(Figure: standard NN vs. recurrent NN)

Dr. Pedram Jahangiry


Feature Engineering in DNN
• Use lagged values (e.g., 12 lags) as independent input features
• Each observation: vector of 12 features representing consecutive time steps
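As a quick illustration (mine, not from the slide), a lagged design matrix of this kind could be built with pandas; the helper `make_lagged_features`, the series `y`, and the column names are hypothetical.

```python
import pandas as pd

def make_lagged_features(y: pd.Series, n_lags: int = 12) -> pd.DataFrame:
    """Turn a univariate series into a supervised table:
    n_lags lag columns as inputs, the current value as the target."""
    df = pd.DataFrame({"target": y})
    for lag in range(1, n_lags + 1):
        df[f"lag_{lag}"] = y.shift(lag)   # value observed `lag` steps earlier
    return df.dropna()                    # drop rows without a full lag window

# Each resulting row is one observation: 12 "independent" features + 1 target.
```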

Dr. Pedram Jahangiry


It’s All about shapes! DNN

Dr. Pedram Jahangiry


It’s All about shapes! DNN

Dr. Pedram Jahangiry


Batch Training in DNNs
• Data is shuffled into batches (e.g., batch size = 16)
• Temporal order among different observations is lost
• Within each observation, the sequential order of lags is preserved

Dr. Pedram Jahangiry


Learning in DNNs
• Model learns to map fixed windows of past values to a target
• Temporal relationships are implicitly modeled through feature patterns

Dr. Pedram Jahangiry


How RNNs Process Time Series Data
• Sequential Processing:
• Input is the time series itself (one feature)
• RNN unrolls over a sequence (e.g., sequence length = 12)
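To make the shapes concrete, here is a minimal sketch (an illustration, not the course code): the same 12-step window that a DNN would see as 12 flat features becomes a 3-D tensor of shape (samples, timesteps = 12, features = 1) for the RNN. The data and layer sizes are placeholders.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X_flat = np.random.rand(100, 12).astype("float32")   # placeholder: 100 windows of 12 lags
X_seq = X_flat[..., np.newaxis]                       # reshape to (100, 12, 1) for the RNN

model = keras.Sequential([
    keras.Input(shape=(12, 1)),    # sequence length 12, one feature per time step
    layers.SimpleRNN(32),          # hidden state is carried across the 12 steps
    layers.Dense(1),               # one-step-ahead forecast
])
print(model(X_seq[:4]).shape)      # (4, 1)
```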

(Figure: recurrent NN)

Dr. Pedram Jahangiry


How RNNs Process Time Series Data
• Hidden State Mechanism:
• Hidden state carries information from previous time steps
• Explicitly models temporal dependencies

(Figure: recurrent NN)

Dr. Pedram Jahangiry


How RNNs Process Time Series Data
• Key Difference from DNNs:
• DNNs treat lagged inputs as independent features
• RNNs connect time steps via hidden states, preserving order

(Figure: standard NN vs. recurrent NN)

Dr. Pedram Jahangiry


Memory in RNNs
• Sequence Length:
• Sets the potential memory span/length (how many past time steps are seen)
• Limited by practical issues (e.g., vanishing gradients limit long-term retention)
• Hidden State Size:
• Determines the capacity/depth of the memory
• Larger hidden states can capture richer, more complex patterns

Dr. Pedram Jahangiry


Feature Engineering in RNN?

Dr. Pedram Jahangiry


It’s All about shapes! RNN

Dr. Pedram Jahangiry


All about shapes!

Dr. Pedram Jahangiry


Key Comparisons (DNN vs RNN)
• DNN:
• Simple and effective for short-term dependencies via engineered features
• Uses engineered lagged features as independent inputs
• Shuffling within batches loses order between samples

• RNN:
• Designed for sequential data
• Processes sequences one time step at a time with a hidden state
• Explicitly captures the order and dependencies in the data
• Memory is influenced by both sequence length and hidden state size

Dr. Pedram Jahangiry


RNN performance (raw data vs pre-processed data)

Dr. Pedram Jahangiry


Module 7 – Part II
Deep Sequence Modeling
Recurrent Neural Networks (RNN)

Dr. Pedram Jahangiry


Sequence Modeling
To model sequence data efficiently, we need a new architecture that:
• Preserves the order
• Accounts for long-term dependencies
• Handles variable-length inputs
• Shares parameters across the sequence

Dr. Pedram Jahangiry


What is RNN (Recurrent Neural Network)?

• The architecture of RNNs is inspired by the way biological intelligence processes information incrementally while maintaining an internal model of what it is processing.

• This ability to remember previous inputs and incorporate them into the current output allows RNNs to model sequential data.

• An RNN maintains a state that contains information relative to what it has seen so far.

• RNNs can be thought of as neural networks with an internal loop, which allows them to process sequences of varying lengths and learn from temporal dependencies.

Dr. Pedram Jahangiry


Perceptron vs Recurrent Cell

Perceptron Recurrent Cell

Dr. Pedram Jahangiry


Unrolling the Recurrent Cell

(Figure: the recurrent cell unrolled through time. Inputs $X_0, X_1, \dots, X_t$ feed repeated copies of the same cell, producing outputs $\hat{y}_0, \hat{y}_1, \dots, \hat{y}_t$.)

Dr. Pedram Jahangiry


Dense Layer vs Recurrent Layer

Dense Layer Recurrent Layer

Dr. Pedram Jahangiry


Inside the Recurrent Cell
$\text{output}_t = f(\text{input}_t, \text{state}_t)$

$s_{t+1} = \text{activation}(W X_t + U s_t + b)$

where $W$ weights the current input, $U$ weights the previous state, and the same $W$, $U$, $b$ are reused at every time step.
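A minimal NumPy sketch of this recurrence (written here for illustration, in the spirit of the "naive RNN" from Deep Learning with Python; the tanh activation, dimensions, and random weights are assumptions):

```python
import numpy as np

timesteps, input_dim, state_dim = 12, 1, 32
inputs = np.random.rand(timesteps, input_dim)   # X_0 ... X_t (placeholder data)

W = np.random.rand(state_dim, input_dim)        # input-to-state weights
U = np.random.rand(state_dim, state_dim)        # state-to-state weights
b = np.zeros(state_dim)

state = np.zeros(state_dim)                     # initial state s_0 = 0
outputs = []
for x_t in inputs:                              # process one time step at a time
    state = np.tanh(W @ x_t + U @ state + b)    # s_{t+1} = activation(W x_t + U s_t + b)
    outputs.append(state)                       # here the output is the new state itself

outputs = np.stack(outputs)                     # shape (timesteps, state_dim)
```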

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


RNN architectures

(Figure: common RNN input/output architectures, including a many-to-many case where inputs and outputs are not aligned.)

Dr. Pedram Jahangiry


How does RNN learn representations?
• Backpropagation Through Time (BPTT): compute $\frac{\partial J}{\partial P}$, where $P$ are the parameters.
• The total loss is the sum of the per-step losses, $J = J_0 + J_1 + \dots + J_t$, so $\frac{\partial J}{\partial W} = \frac{\partial J_0}{\partial W} + \frac{\partial J_1}{\partial W} + \dots$
• $\frac{\partial J_0}{\partial W} = \frac{\partial J_0}{\partial \hat{y}_0}\,\frac{\partial \hat{y}_0}{\partial S_0}\,\frac{\partial S_0}{\partial W}$
• $\frac{\partial J_1}{\partial W} = \frac{\partial J_1}{\partial \hat{y}_1}\,\frac{\partial \hat{y}_1}{\partial S_1}\,\frac{\partial S_1}{\partial W}$, where the dependence of $S_1$ on $W$ also flows through the previous state via $\frac{\partial S_1}{\partial S_0}\,\frac{\partial S_0}{\partial W}$
• In general: $\frac{\partial J_t}{\partial W} = \sum_{k=0}^{t} \frac{\partial J_t}{\partial \hat{y}_t}\,\frac{\partial \hat{y}_t}{\partial S_t}\,\frac{\partial S_t}{\partial S_k}\,\frac{\partial S_k}{\partial W}$

(Figure: the unrolled RNN with per-step losses $J_0, J_1, \dots, J_t$ summing to the total loss $J$; states $S_1, S_2, \dots, S_t$ link consecutive steps, and the same $W$ is applied to every input $X_0, X_1, \dots, X_t$.)

Dr. Pedram Jahangiry


Vanishing Gradient Problem
• As the time horizon gets bigger, this product of derivatives gets longer and longer.
• We are multiplying many small numbers → smaller and smaller gradients → parameters that are barely updated by distant time steps and therefore unable to capture long-term dependencies.

$\frac{\partial J_t}{\partial W} = \sum_{k=0}^{t} \frac{\partial J_t}{\partial \hat{y}_t}\,\frac{\partial \hat{y}_t}{\partial S_t}\,\frac{\partial S_t}{\partial S_k}\,\frac{\partial S_k}{\partial W}$, with $S_t = \text{activation}(W X_{t-1} + U S_{t-1})$

$\frac{\partial S_{10}}{\partial S_0} = \frac{\partial S_{10}}{\partial S_9}\,\frac{\partial S_9}{\partial S_8}\,\frac{\partial S_8}{\partial S_7}\,\frac{\partial S_7}{\partial S_6}\cdots\frac{\partial S_1}{\partial S_0}$

(Figure: the unrolled RNN with states $S_1, S_2, \dots, S_t$ and the same $W$ applied to every input $X_0, X_1, \dots, X_t$.)
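A tiny numeric illustration (my own, not from the slide): if each one-step factor $\partial S_{k+1}/\partial S_k$ has magnitude around 0.5, the chained product across only 10 steps is already close to zero.

```python
factor = 0.5            # assumed typical magnitude of one one-step state derivative
chain = factor ** 10    # corresponds to dS_10/dS_0 as a product of 10 such factors
print(chain)            # 0.0009765625 -> almost no gradient reaches the early steps
```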

Dr. Pedram Jahangiry


A simple timeseries with multiple features example
• A temperature forecasting example: deep-learning-with-python-notebooks
• Predicting the temperature 24 hours in the future
• Target: temperature
• Features: 14 different variables, including pressure, humidity, wind direction, etc.
• Data recorded every 10 minutes from 2009-2016

(Figures: temperature over 2009-2016; temperature over the first 10 days, i.e., 10 × 24 × 6 = 1,440 observations.)

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Preparing the data
• Given the previous 5 days (120 hours) of data, sampled once per hour, can we predict the temperature 24 hours after the end of the sequence?
• Data batches:
• Sequence length = 120
• [1,2,3,…,120][144]
• [2,3,4,…,121][145]
• [3,4,5,…,122][146]
• Batch size: 256 of these (window, target) samples are shuffled and batched
• Sample shape: (256, 120, 14)
• Target shape: (256,)
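A hedged sketch of how such windows can be built with `keras.utils.timeseries_dataset_from_array` (TensorFlow/Keras 2.6+), roughly following the Chollet notebook. The arrays and the train split index below are placeholders, not the actual Jena weather data.

```python
import numpy as np
from tensorflow import keras

# Placeholder arrays standing in for the normalized Jena weather data.
raw_data = np.random.rand(50_000, 14).astype("float32")   # 14 features, one row per 10 minutes
temperature = raw_data[:, 1]                               # pretend column 1 is temperature
num_train_samples = 25_000                                 # placeholder split point

sampling_rate = 6         # raw data every 10 minutes -> keep one sample per hour
sequence_length = 120     # previous 5 days of hourly samples
delay = sampling_rate * (sequence_length + 24 - 1)   # target = temperature 24 h after window end
batch_size = 256

train_dataset = keras.utils.timeseries_dataset_from_array(
    data=raw_data[:-delay],
    targets=temperature[delay:],
    sampling_rate=sampling_rate,
    sequence_length=sequence_length,
    shuffle=True,
    batch_size=batch_size,
    start_index=0,
    end_index=num_train_samples,
)

for samples, targets in train_dataset.take(1):
    print(samples.shape, targets.shape)   # (256, 120, 14) (256,)
```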

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Naïve forecaster: common-sense baseline
• Temperature 24 hours from now = Temperature right now
• This is a random walk (no drift) forecaster.

• Performance:
• Validation MAE = 2.44 degrees Celsius
• Test MAE = 2.62 degrees Celsius
• The baseline model is off by about 2.5 degrees on average. Not bad!!
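For reference, a sketch of how this baseline can be scored on a dataset built as above. It assumes the temperature column is already in degrees Celsius (in the Chollet notebook the features are normalized, so the prediction is de-normalized first) and that the column index is 1.

```python
import numpy as np

TEMP_COL = 1   # assumed index of the temperature column in each sample

def naive_mae(dataset):
    """Predict 'temperature in 24 hours' as 'temperature at the last step of the window'."""
    total_abs_err, n_seen = 0.0, 0
    for samples, targets in dataset:
        preds = samples[:, -1, TEMP_COL]                        # last time step of each window
        total_abs_err += np.sum(np.abs(preds.numpy() - targets.numpy()))
        n_seen += samples.shape[0]
    return total_abs_err / n_seen

# print(naive_mae(val_dataset))   # about 2.44 degrees C in the Chollet notebook
```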

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Let’s try DNN (Deep Neural Networks)

• Test MAE = 2.62 degrees Celsius


• No improvement!!
• Flattening time series data is not a good idea!
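The densely connected model on this slide is, roughly, the following sketch (layer sizes follow the Chollet notebook, but treat them as assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(120, 14))          # (sequence_length, num_features)
x = layers.Flatten()(inputs)                   # flattening discards the temporal structure
x = layers.Dense(16, activation="relu")(x)
outputs = layers.Dense(1)(x)                   # temperature 24 hours ahead
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
# model.fit(train_dataset, validation_data=val_dataset, epochs=10)
```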

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Let’s try CNN (Convolutional Neural Networks)

• Motivation: Maybe a temporal convnet could reuse the same representations across
different days, much like a spatial convnet can reuse the same representations across
different locations in an image!
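A hedged sketch of such a temporal convnet (close in spirit to the Chollet notebook; the filter counts and window sizes are assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(120, 14))
x = layers.Conv1D(8, 24, activation="relu")(inputs)   # reuse one 24-hour pattern everywhere
x = layers.MaxPooling1D(2)(x)                         # pooling coarsens the time axis
x = layers.Conv1D(8, 12, activation="relu")(x)
x = layers.MaxPooling1D(2)(x)
x = layers.Conv1D(8, 6, activation="relu")(x)
x = layers.GlobalAveragePooling1D()(x)                # order information is largely lost here
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
```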

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


CNN performance
• Test MAE = 3.10 degrees Celsius
• Even worse than the densely connected model!!
• CNN treats every segment of the data the same way!
• Pooling layers are destroying order information.

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Let’s try a simple RNN

• Baseline Test MAE = 2.62


• Simple RNN Test MAE = 2.51
• beats the naïve forecaster.
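A sketch of the simple RNN model (the 16 units are an assumption):

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(120, 14))
x = layers.SimpleRNN(16)(inputs)     # one hidden state threaded through all 120 steps
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
```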

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Beyond RNN
RNNs can handle the following sequence modeling criteria:
• Preserve the order
• Handle variable-length inputs
• Share parameters across the sequence

RNN limitations:
• Do not account for long-term dependencies (they only remember short-term history)
• Vanishing gradient problem

Dr. Pedram Jahangiry


Module 7 – Part III
Deep Sequence Modeling
(Gated cells, LSTM)

Dr. Pedram Jahangiry


Beyond RNN
RNNs can handle the following sequence modeling criteria:
• Preserve the order
• Handle variable-length inputs
• Share parameters across the sequence

RNN limitations:
• Do not account for long-term dependencies (they only remember short-term history)
• Vanishing gradient problem

Dr. Pedram Jahangiry


How to solve vanishing gradient problem
1. Use an activation function that prevents the gradient from shrinking too quickly

$S_t = \text{activation}(W X_{t-1} + U S_{t-1})$

Dr. Pedram Jahangiry


How to solve vanishing gradient problem
1. Use an activation function that prevents the gradient from shrinking too quickly
2. Use weight initialization techniques that ensure the initial weights are not too small
3. Use gradient clipping, which caps the magnitude of the gradients; this mainly guards against exploding gradients, the flip side of the same problem (a one-line example follows this list)
4. Use batch normalization, which normalizes the input to each layer and helps reduce the range of activation values and thus the likelihood of vanishing gradients
5. Use an optimization algorithm that is more resilient to vanishing gradients, such as Adam or RMSprop
6. Gated cells: use some sort of skip connection, which allows gradients to bypass some of the layers in the network and thus prevents them from becoming too small
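For item 3, gradient clipping is a one-line change in Keras; a minimal sketch (the threshold 1.0 is arbitrary), which also uses RMSprop as suggested in item 5:

```python
from tensorflow import keras

# clipnorm rescales any gradient whose L2 norm exceeds 1.0, guarding against explosion.
optimizer = keras.optimizers.RMSprop(learning_rate=1e-3, clipnorm=1.0)
# model.compile(optimizer=optimizer, loss="mse", metrics=["mae"])
```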

Dr. Pedram Jahangiry


Gated cells
• Instead of using a simple RNN cell, let's use a more complex cell with gates that control the flow of information.
• Think of a conveyor belt running parallel to the sequence being processed:
• Information can jump on → be transported to a later timestep → jump off when needed.
• This is what a gated cell does! It is analogous to the residual connections we saw before.

• Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) are two examples
of gated cells that can keep track of information throughout many timesteps.

Dr. Pedram Jahangiry


Inside the LSTM cell

(Figure: a simple RNN cell vs. an LSTM cell.)

• Long-term memory: the carry track $C_{t-1} \to C_t$ runs across timesteps.
• Forget gate: forgets irrelevant information from the previous state.
• Input gate: adds relevant information and selectively updates the cell state.
• Output gate: outputs a filtered version of the cell state.
• Short-term memory: the previous hidden state $h_{t-1}$ becomes the new short-term memory $h_t$.
Dr. Pedram Jahangiry


LSTM details
(Figure: the LSTM cell in detail, showing the input $X_t$, the previous hidden state $h_{t-1}$ and cell state $C_{t-1}$, the forget gate $f_t$, input gate $i_t$, output gate $o_t$, the candidate state, and the updated $C_t$ and $h_t$.)
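For reference, the standard LSTM update equations behind this figure (this is the usual textbook formulation with sigmoid gates and a tanh candidate; the slide itself only shows the diagram):

```latex
\begin{aligned}
f_t &= \sigma(W_f X_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i X_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o X_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{C}_t &= \tanh(W_c X_t + U_c h_{t-1} + b_c) && \text{candidate state} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{new carry (long-term memory)} \\
h_t &= o_t \odot \tanh(C_t) && \text{new hidden state (short-term memory)}
\end{aligned}
```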

Dr. Pedram Jahangiry


LSTM takeaway
• LSTM uses gates to regulate the information flow (allows past information to be
reinjected later)
• This new cell state (carry) can better capture longer term dependencies
• LSTM fights the vanishing gradient problem

Dr. Pedram Jahangiry


Let’s try LSTM on the temperature example
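In code, this is the dense model with the Flatten/Dense stack swapped for a single LSTM layer; a sketch (the 16 units follow the Chollet notebook, but treat the sizes as assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(120, 14))
x = layers.LSTM(16)(inputs)          # gated cell with a separate carry track C_t
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
# model.fit(train_dataset, validation_data=val_dataset, epochs=10)
```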

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


LSTM performance
• Baseline Test MAE = 2.62
• Simple LSTM Test MAE = 2.53
• Also beats the naïve forecaster.
• Overfitting?

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Can we do better?

Dr. Pedram Jahangiry


Improving the simple LSTM model
• We can improve the performance of the simple LSTM
model by:
1. Recurrent dropout: use dropout to fight overfitting in the recurrent layers (in addition to dropout for the dense layers)
2. Stacking recurrent layers: increase model complexity to boost representation power
3. Using bidirectional RNNs: process the same information in a different order; mostly used in NLP

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Regular vs Recurrent Dropout

Dr. Pedram Jahangiry


Recurrent Dropout
• The same dropout pattern should be applied at every timestep

• Baseline Test MAE = 2.62


• Simple RNN, Test MAE = 2.51
• Simple LSTM, Test MAE = 2.53
• LSTM with dropout, Test MAE = 2.45
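A sketch of what "LSTM with dropout" can look like in Keras (the rates are illustrative): `recurrent_dropout` applies the same mask at every time step, while the plain `Dropout` layer regularizes the output head.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(120, 14))
x = layers.LSTM(32, recurrent_dropout=0.25)(inputs)   # same dropout mask reused at each step
x = layers.Dropout(0.5)(x)                            # regular dropout before the output layer
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
```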

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Stacking Recurrent Layers
• Let’s train a dropout-regularized, stacked GRU model.
• The GRU is a slightly simpler (hence faster) version of the LSTM architecture.

• Baseline Test MAE = 2.62


• Simple RNN, Test MAE = 2.51
• Simple LSTM, Test MAE = 2.53
• Stacking GRU, Test MAE = 2.39
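A hedged sketch of a dropout-regularized, stacked GRU; note `return_sequences=True` on every recurrent layer except the last, so the next layer receives the full sequence of hidden states (layer sizes are assumptions).

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(120, 14))
x = layers.GRU(32, recurrent_dropout=0.5, return_sequences=True)(inputs)  # pass the whole sequence up
x = layers.GRU(32, recurrent_dropout=0.5)(x)                              # top layer returns the final state
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
```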

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Bidirectional RNN
• Bidirectional RNNs process the input sequence both chronologically and antichronologically.
• Idea: capture patterns (representations) that might be overlooked by a unidirectional RNN.

• For the temperature example, the bidirectional LSTM strongly underperforms even the common-sense baseline.

• Baseline Test MAE = 2.62
• Simple RNN, Test MAE = 2.51
• Simple LSTM, Test MAE = 2.53
• Stacking GRU, Test MAE = 2.39
• Bidirectional RNN, Test MAE = 2.79
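In Keras, the bidirectional variant is a thin wrapper around the recurrent layer; a sketch (layer size assumed):

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(120, 14))
x = layers.Bidirectional(layers.LSTM(16))(inputs)   # one LSTM runs forward in time, one backward
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
```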

Dr. Pedram Jahangiry Deep learning with Python, Francois Chollet


Final message
• Deep learning is more an art than a science: there are too many moving parts!
• Number of units in each recurrent layer
• Number of stacked layers
• Amount of dropout and recurrent dropout
• Number of dense layers
• Sequence length (horizon)!
• Optimizers, learning rates, etc.
• ….
• Apply RNNs to datasets where the past is a good predictor of the future, not to the stock market!

Dr. Pedram Jahangiry


Road map!
✓ Module 1- Introduction to Deep Forecasting
✓ Module 2- Setting up Deep Forecasting Environment
✓ Module 3- Exponential Smoothing
✓ Module 4- ARIMA models
✓ Module 5- Machine Learning for Time series Forecasting
✓ Module 6- Deep Neural Networks
✓ Module 7- Deep Sequence Modeling (RNN, LSTM)
• Module 8- Prophet and Neural Prophet

Dr. Pedram Jahangiry
