
Gradient Descent in Machine Learning: A Basic Introduction

Take a high-level look into gradient descent — one of ML's most popular algorithms.

Written by Niklas Donges
Updated by Jessica Powers | Mar 27, 2023
Reviewed by Parul Pandey

Image: Shutterstock / Built In

Gradient descent is by far the most popular optimization strategy used in machine learning and deep learning at the moment. It is used when training data models, can be combined with every algorithm and is easy to understand and implement. Everyone working with machine learning should understand its concept.

Source: https://builtin.com/data-science/gradient-descent

How to Solve Gradient Descent Challenges
Types of Gradient Descent: Batch, Stochastic, Mini-Batch

Gradient Descent: Data Science Concepts

A video overview of gradient descent. Video: ritvikmath

Introduction to Gradient Descent


Gradient descent is an optimization algorithm that’s used when training a machine learning
model. It’s based on a convex function and tweaks its parameters iteratively to minimize a
given function to its local minimum.
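As a concrete sketch of that idea (not from the article itself), here is gradient descent iteratively tweaking a single parameter to minimize a simple convex function; the function f(x) = (x - 3)^2, the learning rate and the step count are illustrative choices.

```python
# A minimal sketch of gradient descent minimizing the convex function
# f(x) = (x - 3)^2, whose derivative is f'(x) = 2 * (x - 3).
def gradient_descent(start, learning_rate=0.1, steps=100):
    x = start
    for _ in range(steps):
        grad = 2 * (x - 3)            # gradient of f at the current point
        x = x - learning_rate * grad  # move against the gradient
    return x

x_min = gradient_descent(start=0.0)
print(round(x_min, 4))  # converges toward the minimum at x = 3
```

Each iteration moves x a little in the direction that decreases f, which is the whole algorithm in miniature.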

What Is Gradient Descent in Machine Learning?

Gradient descent is an optimization algorithm for finding a local minimum of a differentiable function. To understand this concept fully, it’s important to know about gradients.

What Is a Gradient?

"A gradient measures how much the output of a function changes if


you change the inputs a little bit." — Lex Fridman (MIT)

A gradient simply measures the change in all weights with regard to the change in error. You can also think of a gradient as the slope of a function. The higher the gradient, the steeper the slope and the faster a model can learn. But if the slope is zero, the model stops learning. In mathematical terms, a gradient is the vector of a function’s partial derivatives with respect to its inputs.
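To make the definition concrete, a gradient can be estimated numerically by nudging each input a little bit and watching how the output changes, exactly as the quote describes. The function f below is an illustrative choice.

```python
# Estimating a gradient with central differences: how much does the output
# of f change when each input changes a little bit?
def numerical_gradient(f, x, eps=1e-6):
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += eps
        x_minus = list(x)
        x_minus[i] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad

# f(w) = w0^2 + 3*w1 has partial derivatives 2*w0 and 3.
f = lambda w: w[0] ** 2 + 3 * w[1]
print(numerical_gradient(f, [2.0, 5.0]))  # approximately [4.0, 3.0]
```

The result is one partial derivative per input, stacked into a vector.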


Image: Shutterstock

Imagine a blindfolded man who wants to climb to the top of a hill with the fewest steps along
the way as possible. He might start climbing the hill by taking really big steps in the steepest
direction, which he can do as long as he is not close to the top. As he comes closer to the top,
however, his steps will get smaller and smaller to avoid overshooting it. This process can be
described mathematically using the gradient.

Imagine the image below illustrates our hill from a top-down view and the red arrows are the
steps of our climber. Think of a gradient in this context as a vector that contains the direction
of the steepest step the blindfolded man can take and also how long that step should be.


Image: Niklas Donges

Note that the gradient vector running from X0 to X1 is much longer than the one running from X3 to X4. That’s because the steepness of the hill, which determines the length of the vector, is smaller near the top. This matches the hill example: the hill gets less steep the higher it’s climbed, so a reduced gradient goes along with a reduced slope and a reduced step size for the hill climber.

How Gradient Descent Works


Instead of climbing up a hill, think of gradient descent as hiking down to the bottom of a valley. This is a better analogy because gradient descent is a minimization algorithm that minimizes a given function.

The equation below describes what the gradient descent algorithm does:

b = a − γ∇f(a)

Here b is the next position of our climber, while a represents his current position. The minus sign refers to the minimization part of the algorithm, γ (gamma) is a weighting factor (the learning rate), and the gradient term ∇f(a) points in the direction of steepest ascent; subtracting it moves us in the direction of steepest descent.
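The update rule can be sketched directly in code; the toy function f(a) = a0^2 + a1^2, its hand-written gradient and the step count below are illustrative assumptions.

```python
# The update rule b = a - gamma * grad_f(a), applied repeatedly to the toy
# convex function f(a) = a0^2 + a1^2, whose gradient is (2*a0, 2*a1).
def grad_f(a):
    return [2 * a[0], 2 * a[1]]

def step(a, gamma=0.1):
    g = grad_f(a)
    return [a[i] - gamma * g[i] for i in range(len(a))]  # b = a - gamma * grad

a = [4.0, -2.0]  # current position of the climber
for _ in range(100):
    a = step(a)  # each result is the next position b
print([round(v, 4) for v in a])  # approaches the minimum at (0, 0)
```

Repeating the update walks the position downhill until the gradient, and with it the step size, shrinks toward zero.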

Imagine you have a machine learning problem and want to train your algorithm with gradient descent to minimize your cost function J(w, b) and reach its local minimum by tweaking its parameters (w and b). The image below shows the horizontal axes representing the parameters (w and b), while the cost function J(w, b) is represented on the vertical axis. The cost function shown here is convex.

Image: Niklas Donges

We know we want to find the values of w and b that correspond to the minimum of the cost
function (marked with the red arrow). To start finding the right values we initialize w and
b with some random numbers. Gradient descent then starts at that point (somewhere around
the top of our illustration), and it takes one step after another in the steepest downhill direction until it reaches the point where the cost function is as small as possible.

The Importance of the Learning Rate

How big the steps are that gradient descent takes toward the local minimum is determined by the learning rate, which figures out how fast or slow we will move toward the optimal weights.
For the gradient descent algorithm to reach the local minimum, we must set the learning rate to an appropriate value, which is neither too low nor too high. This is important because if the steps it takes are too big, it may never reach the local minimum as it bounces back and forth across the convex function (see the left image below). If we set the learning rate to a very small value, gradient descent will eventually reach the local minimum, but it may take a long time (see the right image).

Image: Niklas Donges

So the learning rate should never be too high or too low. You can check whether your learning rate is doing well by plotting the cost function on a graph as gradient descent runs.
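That check can be sketched in code by tracking the cost at every iteration; the toy cost f(x) = (x - 3)^2 and the two learning rates below are illustrative choices.

```python
# Tracking the cost per iteration is a quick check on the learning rate:
# a well-chosen rate drives the cost down, a too-high rate makes it blow up.
def cost_history(learning_rate, steps=50, start=10.0):
    x, history = start, []
    for _ in range(steps):
        history.append((x - 3) ** 2)       # cost f(x) = (x - 3)^2
        x -= learning_rate * 2 * (x - 3)   # one gradient descent step
    return history

good = cost_history(0.1)
bad = cost_history(1.1)    # steps too big: bounces across the bowl and diverges
print(good[-1] < good[0])  # cost decreased with the smaller rate
print(bad[-1] > bad[0])    # cost increased with the larger rate
```

Plotting these two histories would show exactly the two failure/success shapes the images describe.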

How to Solve Gradient Descent Challenges


Image: Niklas Donges

If the gradient descent algorithm is working properly, the cost function should decrease after
every iteration.

When gradient descent can’t decrease the cost function anymore and remains more or less on the same level, it has converged. The number of iterations gradient descent needs to converge can sometimes vary a lot. It can take 50 iterations, 60,000 or maybe even 3 million, making the number of iterations to convergence hard to estimate in advance.

There are some algorithms that can automatically tell you if gradient descent has converged,
but you must define a threshold for the convergence beforehand, which is also pretty hard to
estimate. For this reason, simple plots are the preferred convergence test.
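A minimal sketch of such a threshold-based test, using an illustrative toy cost; the epsilon of 1e-9 is an arbitrary pick, which is exactly the difficulty the text mentions.

```python
# A simple threshold-based convergence test: stop when the cost improves by
# less than a chosen epsilon between iterations.
def run_until_converged(learning_rate=0.1, epsilon=1e-9, max_steps=10_000):
    x, prev_cost = 10.0, float("inf")
    for step in range(max_steps):
        cost = (x - 3) ** 2               # toy cost f(x) = (x - 3)^2
        if prev_cost - cost < epsilon:
            return step                   # converged: cost barely moved
        prev_cost = cost
        x -= learning_rate * 2 * (x - 3)  # gradient descent step
    return max_steps

print(run_until_converged())  # iterations until the cost levels off
```

Set epsilon too large and it stops early; too small and it may never trigger, which is why a plot is often easier to trust.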

Another advantage of monitoring gradient descent via plots is that it allows us to easily spot when it isn’t working properly, for example if the cost function is increasing. Most of the time, the reason for an increasing cost function when using gradient descent is a learning rate that’s too high.

If the plot shows the learning curve just going up and down without really reaching a lower point, try decreasing the learning rate. Also, when starting out with gradient descent on a given problem, simply try a few learning rates spaced roughly logarithmically (for example 0.001, 0.01, 0.1) and see which one performs best.

Types of Gradient Descent

There are three popular types of gradient descent, which mainly differ in the amount of data they use.

Batch Gradient Descent


Batch gradient descent, also called vanilla gradient descent, calculates the error for each
example within the training dataset, but only after all training examples have been evaluated
does the model get updated. This whole process is like a cycle and it’s called a training epoch.

Batch gradient descent has some advantages: it is computationally efficient and produces a stable error gradient and stable convergence. A disadvantage is that the stable error gradient can sometimes converge to a state that isn’t the best the model can achieve. It also requires the entire training dataset to be in memory and available to the algorithm.

Stochastic Gradient Descent


By contrast, stochastic gradient descent (SGD) does this for each training example within the
dataset, meaning it updates the parameters for each training example one by one. Depending
on the problem, this can make SGD faster than batch gradient descent. One advantage is the
frequent updates allow us to have a pretty detailed rate of improvement.

The frequent updates, however, are more computationally expensive than the batch gradient
descent approach. Additionally, the frequency of those updates can result in noisy gradients,
which may cause the error rate to jump around instead of slowly decreasing.
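A minimal SGD sketch under illustrative assumptions (a tiny made-up linear dataset and a one-feature linear model, neither from the article):

```python
# Stochastic gradient descent: update w and b after every single
# training example, one by one.
import random

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # made-up (x, y) pairs

random.seed(0)
w, b, lr = 0.0, 0.0, 0.05
for _ in range(500):           # epochs
    random.shuffle(data)       # visit examples in a random order
    for x, y in data:
        err = w * x + b - y    # error on this one example only
        w -= lr * 2 * err * x  # immediate, per-example update
        b -= lr * 2 * err
print(round(w, 2), round(b, 2))  # noisy path, but heads toward w = 2, b = 1
```

Because every example triggers an update, the parameters move constantly, which gives the detailed-but-noisy progress described above.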

Mini-Batch Gradient Descent

Mini-batch gradient descent combines the concepts of SGD and batch gradient descent. It splits the training dataset into small batches and performs an update for each of those batches, creating a balance between the robustness of stochastic gradient descent and the efficiency of batch gradient descent.

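Mini-batch gradient descent splits the training set into small batches and performs an update after each batch. A minimal sketch under illustrative assumptions (tiny made-up dataset, linear model, batch size of 2):

```python
# Mini-batch gradient descent: average the gradient over a small batch,
# then update, so each epoch produces several updates instead of one.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # made-up (x, y) pairs

w, b, lr, batch_size = 0.0, 0.0, 0.05, 2
for _ in range(1000):                          # epochs
    for i in range(0, len(data), batch_size):  # walk over the batches
        batch = data[i:i + batch_size]
        m = len(batch)
        # averaged partial derivatives of the squared error over this batch
        dw = (2 / m) * sum((w * x + b - y) * x for x, y in batch)
        db = (2 / m) * sum((w * x + b - y) for x, y in batch)
        w -= lr * dw
        b -= lr * db
print(round(w, 2), round(b, 2))  # approaches w = 2, b = 1
```

With a batch size of 1 this reduces to SGD, and with the batch size equal to the dataset it reduces to batch gradient descent, which is why mini-batch sits between the two.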