Multidimensional Gradient Methods
in Optimization
by
Dr. Md. Rajibul Islam
CSE, UAP
Steepest Ascent/Descent Method
Multidimensional Gradient Methods - Overview
Use information from the derivatives of the objective function to guide the search.
Find solutions more quickly than direct search methods.
A good initial estimate of the solution is required.
The objective function needs to be differentiable.
Gradients
The gradient is a vector operator denoted by $\nabla$ (referred to as "del").
When applied to a function $f$, it gives the function's directional derivatives.
The gradient points in the direction of steepest ascent; its negative points in the direction of steepest descent.
For a function of two variables, the gradient is calculated by
$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}$$
Gradients-Example
Calculate the gradient to determine the direction of the steepest slope at the point (2, 1) for the function $f(x, y) = x^2 y^2$.
Solution: To calculate the gradient we would need to calculate
$$\frac{\partial f}{\partial x} = 2xy^2 = 2(2)(1)^2 = 4 \qquad \frac{\partial f}{\partial y} = 2x^2 y = 2(2)^2(1) = 8$$
which are used to determine the gradient at the point (2, 1) as
$$\nabla f = 4\mathbf{i} + 8\mathbf{j}$$
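As a quick check of this hand calculation, the minimal sketch below approximates the same partial derivatives with central finite differences; the helper names f and gradient are our own and are not part of the original slides.

```python
def f(x, y):
    # f(x, y) = x^2 * y^2
    return x**2 * y**2

def gradient(x, y, h=1e-6):
    # Central-difference approximations of the partial derivatives.
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy

print(gradient(2.0, 1.0))   # approximately (4.0, 8.0), i.e. 4i + 8j
```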
Hessians
The Hessian matrix, or just the Hessian, is the square matrix of second-order partial derivatives of a function (the Jacobian matrix of its gradient).
The determinant of the Hessian matrix is also referred to as the Hessian.
For a function of two variables, the Hessian matrix is simply
$$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[1ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}$$
Hessians cont.
The determinant of the Hessian matrix, denoted by $|H|$, can have three cases (evaluated at a point where the gradient is zero):
1. If $|H| > 0$ and $\partial^2 f / \partial x^2 > 0$, then $f(x, y)$ has a local minimum.
2. If $|H| > 0$ and $\partial^2 f / \partial x^2 < 0$, then $f(x, y)$ has a local maximum.
3. If $|H| < 0$, then $f(x, y)$ has a saddle point.
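This decision rule can be written as a small sketch, assuming the second partial derivatives have already been evaluated at a critical point; the function name classify_critical_point is our own.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Second-derivative test for f(x, y) at a point where the gradient is zero."""
    det_H = fxx * fyy - fxy * fxy    # determinant of the 2x2 Hessian
    if det_H > 0 and fxx > 0:
        return "local minimum"
    if det_H > 0 and fxx < 0:
        return "local maximum"
    if det_H < 0:
        return "saddle point"
    return "inconclusive"            # |H| = 0: the test gives no information

print(classify_critical_point(fxx=2.0, fyy=2.0, fxy=0.0))   # local minimum
```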
Hessians-Example
Calculate the Hessian matrix at the point (2, 1) for the function $f(x, y) = x^2 y^2$.
Solution: To calculate the Hessian matrix, the second partial derivatives must be evaluated as
$$\frac{\partial^2 f}{\partial x^2} = 2y^2 = 2(1)^2 = 2 \qquad \frac{\partial^2 f}{\partial y^2} = 2x^2 = 2(2)^2 = 8 \qquad \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x} = 4xy = 4(2)(1) = 8$$
resulting in the Hessian matrix
$$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[1ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} = \begin{bmatrix} 2 & 8 \\ 8 & 8 \end{bmatrix}$$
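To double-check these entries numerically, the sketch below builds the Hessian from central finite differences; NumPy is assumed to be available, and the helper name hessian is our own.

```python
import numpy as np

def f(x, y):
    return x**2 * y**2

def hessian(x, y, h=1e-4):
    # Central-difference approximations of the second partial derivatives.
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return np.array([[fxx, fxy], [fxy, fyy]])

H = hessian(2.0, 1.0)
print(H)                 # approximately [[2. 8.] [8. 8.]]
print(np.linalg.det(H))  # determinant of the Hessian, approximately -48
```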
Steepest Ascent/Descent Method
Starts from an initial point and looks for a local optimum along the gradient direction.
The gradient at the initial point is calculated.
A new point is found at the optimum of the function along that gradient direction (a one-dimensional line search).
Subsequent iterations repeat the process, using the optimum found along each new gradient as the next starting point (a sketch of this loop follows below).
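The sketch below implements one version of this loop for minimization. It assumes NumPy and SciPy are available; steepest_descent and its parameters (tol, max_iter) are our own names, and the one-dimensional search along the negative gradient is handed to scipy.optimize.minimize_scalar. It is exercised on the example treated on the following slides.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, tol=1e-6, max_iter=100):
    """Minimize f by repeated line searches along the negative gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:        # gradient ~ 0: local optimum reached
            break
        # One-dimensional search: find the best step size h along -g.
        step = minimize_scalar(lambda h: f(x - h * g))
        x = x - step.x * g
    return x

# Example: f(x, y) = x^2 + y^2 + 2x + 4, starting from (2, 1).
f = lambda v: v[0]**2 + v[1]**2 + 2*v[0] + 4
grad = lambda v: np.array([2*v[0] + 2, 2*v[1]])
print(steepest_descent(f, grad, [2.0, 1.0]))   # approximately [-1.  0.]
```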
Example
Determine the minimum of the function
$$f(x, y) = x^2 + y^2 + 2x + 4$$
Use the point (2, 1) as the initial estimate of the optimal solution.
Solution
Iteration 1: To calculate the gradient, the partial derivatives must be evaluated as
$$\frac{\partial f}{\partial x} = 2x + 2 = 2(2) + 2 = 6 \qquad \frac{\partial f}{\partial y} = 2y = 2(1) = 2$$
$$\nabla f = 6\mathbf{i} + 2\mathbf{j}$$
Now the function $f(x, y)$ can be expressed along the direction of the gradient as
$$f\!\left(x_0 + \frac{\partial f}{\partial x}h,\; y_0 + \frac{\partial f}{\partial y}h\right) = f(2 + 6h,\, 1 + 2h) = (2 + 6h)^2 + (1 + 2h)^2 + 2(2 + 6h) + 4$$
$$g(h) = 40h^2 + 40h + 13$$
Solution Cont.
Iteration 1 continued:
This is a simple function, and it is easy to determine $h^* = -0.5$ by taking the first derivative and solving for its root: $g'(h) = 80h + 40 = 0$.
This means that traveling a step size of $h = -0.5$ along the gradient reaches the minimum value of the function in this direction. These values are substituted back to calculate new values for $x$ and $y$ as follows:
$$x = 2 + 6(-0.5) = -1$$
$$y = 1 + 2(-0.5) = 0$$
Note that $f(2, 1) = 13 > f(-1, 0) = 3.0$.
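A short sketch checking these Iteration 1 numbers; all names (f, gx, gy, h_star) are our own and the gradient is entered analytically.

```python
f = lambda x, y: x**2 + y**2 + 2*x + 4

# Gradient of f at the starting point (2, 1): (2x + 2, 2y).
x0, y0 = 2.0, 1.0
gx, gy = 2*x0 + 2, 2*y0              # (6.0, 2.0)

# Along the gradient, g(h) = f(x0 + gx*h, y0 + gy*h) = 40h^2 + 40h + 13,
# so g'(h) = 80h + 40 = 0 gives the optimal step size.
h_star = -40.0 / 80.0                # -0.5
x1, y1 = x0 + gx * h_star, y0 + gy * h_star
print((x1, y1))                      # (-1.0, 0.0)
print(f(x0, y0), f(x1, y1))          # 13.0 3.0
```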
Solution Cont.
Iteration 2: The new initial point is $(-1, 0)$. We calculate the gradient at this point as
$$\frac{\partial f}{\partial x} = 2x + 2 = 2(-1) + 2 = 0 \qquad \frac{\partial f}{\partial y} = 2y = 2(0) = 0$$
$$\nabla f = 0\mathbf{i} + 0\mathbf{j}$$
This indicates that the current location is a local optimum and no improvement can be gained by moving in any direction. The minimum of the function is at the point $(-1, 0)$, and
$$f_{\min} = (-1)^2 + (0)^2 + 2(-1) + 4 = 3$$
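As a final cross-check, SciPy's general-purpose minimizer (its default quasi-Newton method, not the steepest descent loop above) finds the same minimizer; SciPy is assumed to be available.

```python
from scipy.optimize import minimize

f = lambda v: v[0]**2 + v[1]**2 + 2*v[0] + 4

result = minimize(f, x0=[2.0, 1.0])
print(result.x)    # approximately [-1.  0.]
print(result.fun)  # approximately 3.0
```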
THE END