Lecture 1.
Closed-form Equations, Types of Gradient Descent
(Batch, Stochastic, Mini-batch) - Definitions, Properties
Dr. Mainak Biswas
Closed-form Equation
• Closed-form Equation: A closed-form equation is a mathematical expression that provides a direct way to compute a value without requiring iterative procedures or infinite series
– Example:
• Sum of an Arithmetic Series: The sum of the first n
terms of an arithmetic series with the first term a and
common difference d is:
$$S_n = \frac{n}{2}\bigl(2a + (n-1)d\bigr)$$
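As a quick illustration (not part of the original slide), the sketch below compares the closed-form sum with an explicit loop; the function names are hypothetical choices for this example.

```python
# Illustrative sketch: closed-form vs. iterative sum of an arithmetic series.
# Function names are hypothetical, chosen only for this example.

def series_sum_closed_form(a: float, d: float, n: int) -> float:
    """Closed form: S_n = (n / 2) * (2a + (n - 1)d), no loop required."""
    return n / 2 * (2 * a + (n - 1) * d)

def series_sum_iterative(a: float, d: float, n: int) -> float:
    """Iterative alternative: add the n terms one by one."""
    return float(sum(a + i * d for i in range(n)))

print(series_sum_closed_form(a=2, d=3, n=10))  # 155.0
print(series_sum_iterative(a=2, d=3, n=10))    # 155.0
```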
Gradient Descent
• Gradient Descent is an optimization algorithm used in
machine learning and deep learning to minimize the
loss function by updating the model's parameters in
the direction of the steepest descent
• The type of gradient descent depends on how much
data is used to compute the gradient at each iteration
• Gradient descent is also informally called "the steepest downward slope algorithm"
• It is very important in machine learning, where it is
used to minimize a cost function
Loss Function
$$E(w) = \frac{1}{2N}\sum_{i=1}^{N}\bigl(f(x_i) - y_i\bigr)^2$$
• Where $f(x_i) = w^T x_i$, then
$$\frac{\partial E}{\partial w} = \frac{1}{N}\sum_{i=1}^{N}\bigl(f(x_i) - y_i\bigr)\,x_i$$
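A minimal NumPy sketch (illustrative, not from the slides) of this loss and its gradient for the linear model; the names mse_loss and mse_gradient are assumptions made for this example.

```python
import numpy as np

# Illustrative sketch of E(w) and dE/dw for a linear model f(x_i) = w^T x_i.
# X has shape (N, d): one row per example; y has shape (N,).

def mse_loss(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """E(w) = 1/(2N) * sum_i (f(x_i) - y_i)^2."""
    residuals = X @ w - y
    return float(np.sum(residuals ** 2) / (2 * len(y)))

def mse_gradient(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """dE/dw = 1/N * sum_i (f(x_i) - y_i) * x_i."""
    residuals = X @ w - y
    return X.T @ residuals / len(y)
```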
Mathematical Formulation of Gradient
Descent
$$w = w - \eta\,\nabla E(w)$$
• 𝑤 : Model parameters (weights)
• 𝜂 : Learning rate
• 𝛻𝐸(𝑤): Gradient of the loss function 𝐸(𝑤) with
respect to 𝑤
• It can also be written as:
$$w = w - \frac{\eta}{N}\sum_{i=1}^{N}\bigl(f(x_i) - y_i\bigr)\,x_i$$
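A minimal, self-contained NumPy sketch of this update rule; the name gd_step, the toy data, and the learning rate are illustrative assumptions, not part of the original slide.

```python
import numpy as np

# Illustrative single gradient-descent step for the loss above:
# w <- w - (eta / N) * sum_i (f(x_i) - y_i) * x_i, with f(x_i) = w^T x_i.

def gd_step(w: np.ndarray, X: np.ndarray, y: np.ndarray, eta: float) -> np.ndarray:
    grad = X.T @ (X @ w - y) / len(y)   # gradient of E(w) over the N examples
    return w - eta * grad               # move against the gradient

# Toy usage: repeat the step a fixed number of times.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.zeros(2)
for _ in range(500):
    w = gd_step(w, X, y, eta=0.1)
print(w)   # approaches the least-squares solution [1, 2]
```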
Numerical Problem
• Let $E(w) = (w - 3)^2 + 2$, $\eta = 0.1$, $w = 0$;
  find $w$ and $E(w)$ for five iterations
• Since $x_i = 1$ here, the iterations can be
  solved in terms of $w$ only
• $\dfrac{\partial E}{\partial w} = 2(w - 3)$
• $w_{\text{new}} = w_{\text{old}} - \eta\,\nabla E(w_{\text{old}}) = w_{\text{old}} - 0.1 \cdot 2(w_{\text{old}} - 3) = 0.8\,w_{\text{old}} + 0.6$
Sl   w        E(w)
1    0        11
2    0.6      7.76
3    1.0800   5.6864
4    1.4640   4.3597
5    1.7712   3.5099
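The short Python sketch below (illustrative, not from the slides) reproduces the table by iterating the update derived above.

```python
# Illustrative check of the table: iterate w <- w - 0.1 * 2 * (w - 3),
# i.e. w <- 0.8 * w + 0.6, for E(w) = (w - 3)^2 + 2, starting at w = 0.

def E(w: float) -> float:
    return (w - 3) ** 2 + 2

w, eta = 0.0, 0.1
for step in range(1, 6):
    print(f"{step}  w = {w:.4f}  E(w) = {E(w):.4f}")
    w -= eta * 2 * (w - 3)   # gradient step
```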
Batch Gradient Descent
• Batch Gradient Descent is an optimization algorithm used to minimize a loss function by iteratively updating the model's parameters using the entire dataset to calculate the gradient (see the sketch after this list)
• Advantages:
– Computes the gradient with high precision using the entire
dataset
– Converges steadily towards the minimum
– Suitable for smooth and convex loss functions
• Disadvantages:
– Memory-intensive when the dataset is large
– Requires processing the entire dataset for each iteration
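A minimal NumPy sketch of batch gradient descent for the squared-error loss used earlier; every update uses the full dataset. The function name and hyper-parameter values are assumptions for illustration.

```python
import numpy as np

# Illustrative batch gradient descent: each update uses all N examples.

def batch_gd(X: np.ndarray, y: np.ndarray,
             eta: float = 0.1, n_epochs: int = 100) -> np.ndarray:
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        grad = X.T @ (X @ w - y) / len(y)   # exact gradient over the whole dataset
        w -= eta * grad                     # one precise but costly update per epoch
    return w
```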
Stochastic Gradient Descent
• Stochastic Gradient Descent (SGD) is a variant of gradient descent where the model parameters are updated using only a single training example at a time, rather than the entire dataset (sketched after this list)
• This leads to faster updates and can help the algorithm
escape local minima, making it suitable for large datasets
• Advantages
– Faster Updates
– Escaping Local Minima
– Scalability
• Disadvantages
– Noisy Convergence
– Requires More Iterations
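A minimal NumPy sketch of SGD for the same squared-error loss; one randomly chosen example per update. The function name, learning rate, and epoch count are assumptions for illustration.

```python
import numpy as np

# Illustrative stochastic gradient descent: each update uses a single
# randomly chosen training example, giving cheap but noisy steps.

def sgd(X: np.ndarray, y: np.ndarray,
        eta: float = 0.01, n_epochs: int = 10, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):      # visit examples in random order
            grad_i = (X[i] @ w - y[i]) * X[i]  # gradient of one example's loss
            w -= eta * grad_i                  # noisy but cheap update
    return w
```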
Mini-Batch Gradient Descent
• Mini-batch Gradient Descent is a hybrid of Batch Gradient Descent and Stochastic Gradient Descent. It aims to combine the advantages of both by updating the model parameters using a subset (mini-batch) of the training data rather than the entire dataset (batch) or a single data point (stochastic); see the sketch after this list
– Mini-batch: The dataset is divided into small batches, each containing a fixed number of training examples; the size of each mini-batch, denoted $b$, is a hyper-parameter
– Gradient Calculation: For each mini-batch, the gradient is calculated based on
the average of the training examples in that batch
– Weight Update: The model parameters are updated using the computed
gradient for the mini-batch
– Repeat for all mini-batches until convergence
• Advantages: Faster than Batch GD, Less Noisy than SGD
• Disadvantages: Choosing the Right Batch Size, Memory Considerations
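A minimal NumPy sketch of mini-batch gradient descent for the same squared-error loss; the batch size b and the other values are assumptions made for illustration.

```python
import numpy as np

# Illustrative mini-batch gradient descent: each update averages the
# gradient over b examples drawn from a shuffled pass through the data.

def minibatch_gd(X: np.ndarray, y: np.ndarray, b: int = 32,
                 eta: float = 0.05, n_epochs: int = 20, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        idx = rng.permutation(len(y))                 # shuffle once per epoch
        for start in range(0, len(y), b):
            batch = idx[start:start + b]              # indices of one mini-batch
            Xb, yb = X[batch], y[batch]
            grad = Xb.T @ (Xb @ w - yb) / len(batch)  # average gradient over the batch
            w -= eta * grad
    return w
```

Setting b = 1 recovers stochastic gradient descent, while b = N recovers batch gradient descent.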