
What is LSTM – Long Short Term Memory?

Last Updated: 05 Apr, 2025

Long Short-Term Memory (LSTM) is an enhanced version of the Recurrent Neural Network
(RNN) designed by Hochreiter & Schmidhuber. LSTMs can capture long-term dependencies in sequential
data, making them ideal for tasks like language translation, speech recognition and time series
forecasting.

Unlike traditional RNNs, which use a single hidden state passed through time, LSTMs introduce a memory
cell that holds information over extended periods, addressing the challenge of learning long-term
dependencies.

Problem with Long-Term Dependencies in RNNs


Recurrent Neural Networks (RNNs) are designed to handle sequential data by maintaining a hidden state
that captures information from previous time steps. However, they often struggle to learn long-term
dependencies, where information from distant time steps becomes crucial for making accurate
predictions about the current state. This difficulty shows up as the vanishing gradient or exploding gradient
problem.

Vanishing Gradient: During training, the gradients (which drive learning) can shrink as they are
propagated back through many time steps. This makes it hard for the model to learn long-term
patterns, since the contribution of earlier information becomes almost negligible.
Exploding Gradient: Conversely, gradients can grow too large, causing instability. This makes it
difficult for the model to learn properly, as the weight updates become erratic and unpredictable.

Both of these issues make it challenging for standard RNNs to effectively capture long-term
dependencies in sequential data.
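
The effect can be seen numerically. The following NumPy sketch is a simplification that ignores the activation-function derivatives and uses made-up weight scales; it repeatedly multiplies a gradient by the same recurrent Jacobian, which is the core operation of backpropagation through time:

```python
import numpy as np

rng = np.random.default_rng(0)

def backprop_norms(scale, steps=50, size=8):
    """Propagate a gradient back through `steps` time steps of a toy RNN.

    The recurrent Jacobian is approximated by a fixed random matrix W;
    activation-function derivatives are ignored for simplicity.
    """
    W = rng.standard_normal((size, size)) * scale / np.sqrt(size)
    grad = np.ones(size)              # gradient arriving at the last time step
    norms = []
    for _ in range(steps):
        grad = W.T @ grad             # one step of backpropagation through time
        norms.append(np.linalg.norm(grad))
    return norms

print("small recurrent weights ->", backprop_norms(scale=0.5)[-1])  # shrinks toward 0 (vanishing)
print("large recurrent weights ->", backprop_norms(scale=1.5)[-1])  # blows up (exploding)
```

With small recurrent weights the gradient norm decays toward zero after a few dozen steps, and with large weights it grows explosively, which is exactly why standard RNNs have trouble with long sequences.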

LSTM Architecture
The LSTM architecture is built around a memory cell that is controlled by three gates: the input gate, the forget
gate and the output gate. These gates decide what information to add to, remove from and output from
the memory cell.

Input gate: Controls what information is added to the memory cell.
Forget gate: Determines what information is removed from the memory cell.
Output gate: Controls what information is output from the memory cell.

This allows LSTM networks to selectively retain or discard information as it flows through the network,
which enables them to learn long-term dependencies. The network also maintains a hidden state, which acts
as its short-term memory. This hidden state is updated using the current input, the previous hidden state and
the current state of the memory cell.
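
To make this interface concrete, here is a minimal sketch using PyTorch's built-in nn.LSTM layer (the framework choice and the tensor sizes are illustrative assumptions, not something the article prescribes). It shows the two kinds of state described above: the per-step hidden state (short-term memory) and the cell state (long-term memory):

```python
import torch
import torch.nn as nn

# Toy dimensions chosen only for illustration.
batch, seq_len, input_size, hidden_size = 4, 10, 16, 32

lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
x = torch.randn(batch, seq_len, input_size)      # a batch of input sequences

# output: hidden state at every time step; (h_n, c_n): final hidden and cell states.
output, (h_n, c_n) = lstm(x)

print(output.shape)   # torch.Size([4, 10, 32]) -- short-term memory over time
print(h_n.shape)      # torch.Size([1, 4, 32])  -- final hidden state
print(c_n.shape)      # torch.Size([1, 4, 32])  -- final cell (long-term) state
```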

Working of LSTM
The LSTM architecture has a chain structure in which each repeating module contains four interacting
neural network layers and memory blocks called cells.
[Figure: LSTM Model]

Information is retained by the cells and the memory manipulations are done by the gates. There are
three gates –

Forget Gate

The forget gate removes information that is no longer useful from the cell state. Two inputs,
x_t (the input at the current time step) and h_{t-1} (the previous hidden state), are fed to the gate and multiplied with
weight matrices, followed by the addition of a bias. The result is passed through a sigmoid activation function,
which outputs a value between 0 and 1 for each element of the cell state. A value close to 0 means that piece of
information is forgotten, while a value close to 1 means it is retained for future use.

The equation for the forget gate is:


f_t = σ(W_f · [h_{t-1}, x_t] + b_f)

where:

W_f represents the weight matrix associated with the forget gate.
[h_{t-1}, x_t] denotes the concatenation of the previous hidden state and the current input.
b_f is the bias term of the forget gate.
σ is the sigmoid activation function.

[Figure: Forget Gate]
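
As a minimal illustration of this equation, here is a NumPy sketch of the forget-gate computation; the dimensions and random weights are placeholders for trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size = 4, 3
rng = np.random.default_rng(0)

W_f = rng.standard_normal((hidden_size, hidden_size + input_size))  # forget-gate weights
b_f = np.zeros(hidden_size)                                         # forget-gate bias

h_prev = rng.standard_normal(hidden_size)   # h_{t-1}: previous hidden state
x_t = rng.standard_normal(input_size)       # x_t: current input

# f_t = sigmoid(W_f . [h_{t-1}, x_t] + b_f); every entry lies in (0, 1)
f_t = sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)
print(f_t)   # values near 0 mean "forget", values near 1 mean "keep"
```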

Input Gate

The input gate adds useful information to the cell state. First, the information is regulated by a sigmoid
function that, similar to the forget gate, filters which values should be remembered using the inputs
h_{t-1} and x_t. Then, a tanh function creates a vector of candidate values in the range -1 to +1 from
h_{t-1} and x_t. Finally, the candidate vector is multiplied element-wise by the regulated values to obtain
the useful information to add. The equations for the input gate are:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)

Ĉ_t = tanh(W_C · [h_{t-1}, x_t] + b_C)

To update the cell state, we multiply the previous state C_{t-1} by f_t, discarding the information we had
previously chosen to forget. We then add i_t ⊙ Ĉ_t, the candidate values scaled by how much we decided
to update each state element:

C_t = f_t ⊙ C_{t-1} + i_t ⊙ Ĉ_t

where:

⊙ denotes element-wise multiplication.
tanh is the tanh activation function.

[Figure: Input Gate]
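
The same style of sketch covers the input gate and the cell-state update (again with random placeholder weights, and a stand-in value for the forget gate output f_t):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size = 4, 3
rng = np.random.default_rng(1)

W_i = rng.standard_normal((hidden_size, hidden_size + input_size))  # input-gate weights
W_C = rng.standard_normal((hidden_size, hidden_size + input_size))  # candidate weights
b_i = np.zeros(hidden_size)
b_C = np.zeros(hidden_size)

h_prev = rng.standard_normal(hidden_size)         # h_{t-1}: previous hidden state
x_t = rng.standard_normal(input_size)             # x_t: current input
C_prev = rng.standard_normal(hidden_size)         # C_{t-1}: previous cell state
f_t = sigmoid(rng.standard_normal(hidden_size))   # stand-in for the forget gate output

concat = np.concatenate([h_prev, x_t])
i_t = sigmoid(W_i @ concat + b_i)    # i_t: how much of each candidate to let in
C_hat = np.tanh(W_C @ concat + b_C)  # candidate values in (-1, 1)

# C_t = f_t ⊙ C_{t-1} + i_t ⊙ Ĉ_t (element-wise): keep old memory, add new information
C_t = f_t * C_prev + i_t * C_hat
print(C_t)
```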
Output Gate

The output gate extracts useful information from the current cell state to present as output. First, a tanh
function is applied to the cell state to generate a vector of its values. Then, a sigmoid function regulates the
information, filtering which values to output using the inputs h_{t-1} and x_t. Finally, the two are multiplied
element-wise to produce the new hidden state h_t, which is sent as the output and passed on to the next cell.
The equations for the output gate and the hidden state are:

o_t = σ(W_o · [h_{t-1}, x_t] + b_o)

h_t = o_t ⊙ tanh(C_t)

[Figure: Output Gate]
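
A final sketch completes the single LSTM step with the output gate and the new hidden state (random placeholder weights, and a stand-in cell state C_t):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, input_size = 4, 3
rng = np.random.default_rng(2)

W_o = rng.standard_normal((hidden_size, hidden_size + input_size))  # output-gate weights
b_o = np.zeros(hidden_size)

h_prev = rng.standard_normal(hidden_size)   # h_{t-1}: previous hidden state
x_t = rng.standard_normal(input_size)       # x_t: current input
C_t = rng.standard_normal(hidden_size)      # current cell state (from the update above)

o_t = sigmoid(W_o @ np.concatenate([h_prev, x_t]) + b_o)  # which parts of the cell to expose
h_t = o_t * np.tanh(C_t)                                  # new hidden state / cell output
print(h_t)
```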

Bidirectional LSTM Model


Bidirectional LSTM (Bi-LSTM / BLSTM) is a variation of the standard LSTM that processes sequential data in
both the forward and backward directions. This gives a Bi-LSTM access to context from both past and future
time steps, unlike a traditional LSTM, which can only process the sequence in one direction.

Bi-LSTMs are made up of two LSTM networks: one that processes the input sequence in the forward
direction and one that processes it in the backward direction.
The outputs of the two LSTM networks are then combined to produce the final output.

LSTM models, including Bi-LSTMs, have demonstrated state-of-the-art performance across various
tasks such as machine translation, speech recognition and text summarization.

LSTM networks can also be stacked to form deeper models, allowing them to learn more complex patterns in
data. Each layer in the stack captures different levels of information and time-based relationships in the
input.
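
Both ideas map directly onto standard framework options. As a hedged illustration (PyTorch is an assumed framework choice and the tensor sizes are made up), the sketch below stacks two LSTM layers and makes them bidirectional; the output feature dimension doubles because the forward and backward passes are concatenated:

```python
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 10, 16, 32

# Two stacked layers, each processing the sequence forward and backward.
bi_lstm = nn.LSTM(input_size, hidden_size,
                  num_layers=2, bidirectional=True, batch_first=True)

x = torch.randn(batch, seq_len, input_size)
output, (h_n, c_n) = bi_lstm(x)

print(output.shape)  # torch.Size([4, 10, 64]): forward and backward features concatenated
print(h_n.shape)     # torch.Size([4, 4, 32]):  num_layers * num_directions = 4 final states
```

In practice the combined forward and backward features are then fed to a task-specific layer, such as a classifier or a decoder.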

Applications of LSTM
Some of the well-known applications of LSTMs include:

Language Modeling: Used in tasks like language modeling, machine translation and text
summarization. These networks learn the dependencies between words in a sentence to generate
coherent and grammatically correct sentences.
Speech Recognition: Used in transcribing speech to text and recognizing spoken commands. By
learning speech patterns they can match spoken words to corresponding text.
Time Series Forecasting: Used for predicting stock prices, weather and energy consumption.
They learn patterns in time series data to predict future events.
Anomaly Detection: Used for detecting fraud or network intrusions. These networks can identify
patterns in data that deviate drastically and flag them as potential anomalies.
Recommender Systems: In recommendation tasks like suggesting movies, music and books.
They learn user behavior patterns to provide personalized suggestions.
Video Analysis: Applied in tasks such as object detection, activity recognition and action
classification. When combined with Convolutional Neural Networks (CNNs) they help analyze video
data and extract useful information.
