Case Study: Quadratic Function Optimization
Let’s consider optimizing the quadratic function:
$$f(x) = \frac{1}{2} x^T Q x - b^T x$$
where
$$Q = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
This is a strictly convex function (since $Q$ is symmetric positive definite) with a unique minimizer at $x^* = Q^{-1} b$.
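A minimal NumPy sketch, assuming only `numpy`, computes this minimizer directly for later reference:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# The unique minimizer of f(x) = 0.5 x^T Q x - b^T x solves Q x = b
x_star = np.linalg.solve(Q, b)
print(x_star)  # [0.09090909 0.63636364], i.e. (1/11) * [1, 7]
```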
Step 1: Initialization
We choose an initial guess:
$$x_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
The initial inverse Hessian approximation is the identity matrix:
$$H_0 = I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Now, your turn! Compute the initial gradient $g_0 = \nabla f(x_0)$. What do you get?
(RESPONSE): We compute the initial gradient:
$$g_0 = \nabla f(x_0) = Q x_0 - b$$
Substituting the given values:
$$g_0 = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
Perform the matrix-vector multiplication:
$$\begin{bmatrix} 4(2) + 1(1) \\ 1(2) + 3(1) \end{bmatrix} = \begin{bmatrix} 8 + 1 \\ 2 + 3 \end{bmatrix} = \begin{bmatrix} 9 \\ 5 \end{bmatrix}$$
Subtracting $b$:
$$g_0 = \begin{bmatrix} 9 \\ 5 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ 3 \end{bmatrix}$$
Thus, the initial gradient is:
$$g_0 = \begin{bmatrix} 8 \\ 3 \end{bmatrix}$$
Now, compute the search direction using the initial inverse Hessian approximation:
$$p_0 = -H_0 g_0$$
What do you get?
We compute the search direction:
$$p_0 = -H_0 g_0$$
Since $H_0$ is the identity matrix:
$$p_0 = -I g_0 = -\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 8 \\ 3 \end{bmatrix}$$
Multiplying by $I$ (which leaves the vector unchanged):
$$p_0 = -\begin{bmatrix} 8 \\ 3 \end{bmatrix} = \begin{bmatrix} -8 \\ -3 \end{bmatrix}$$
Thus, the search direction is:
$$p_0 = \begin{bmatrix} -8 \\ -3 \end{bmatrix}$$
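These two steps are easy to verify numerically; a short NumPy sketch mirroring the hand computation:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.array([2.0, 1.0])
H0 = np.eye(2)  # initial inverse Hessian approximation

g0 = Q @ x0 - b  # gradient at x0: [8, 3]
p0 = -H0 @ g0    # search direction: [-8, -3]
print(g0, p0)
```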
Now, perform a line search to determine the optimal step size $\alpha_0$. We seek $\alpha_0 > 0$ minimizing:
$$\phi(\alpha) = f(x_0 + \alpha p_0)$$
Substituting $f(x) = \frac{1}{2} x^T Q x - b^T x$, derive the optimal $\alpha_0$.
(RESPONSE):
(I ASKED IF THAT WAS ALL, AND IT SAID:)
No, there’s much more! Let’s go step by step through the line search process
in detail.
We aim to minimize:
$$\phi(\alpha) = f(x_0 + \alpha p_0)$$
Step 1: Express $x(\alpha)$ in Terms of $\alpha$
Since $x_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$ and $p_0 = \begin{bmatrix} -8 \\ -3 \end{bmatrix}$, we write:
$$x(\alpha) = x_0 + \alpha p_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \alpha \begin{bmatrix} -8 \\ -3 \end{bmatrix} = \begin{bmatrix} 2 - 8\alpha \\ 1 - 3\alpha \end{bmatrix}$$
Step 2: Compute $\phi(\alpha) = f(x(\alpha))$
Since $f(x) = \frac{1}{2} x^T Q x - b^T x$, we expand:
$$\phi(\alpha) = \frac{1}{2} x(\alpha)^T Q x(\alpha) - b^T x(\alpha)$$
First Term: Quadratic Form Expansion
Expanding $x(\alpha)^T Q x(\alpha)$:
$$x(\alpha)^T Q x(\alpha) = \begin{bmatrix} 2 - 8\alpha & 1 - 3\alpha \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 2 - 8\alpha \\ 1 - 3\alpha \end{bmatrix}$$
Performing the matrix-vector multiplication:
$$Q x(\alpha) = \begin{bmatrix} 4(2 - 8\alpha) + 1(1 - 3\alpha) \\ 1(2 - 8\alpha) + 3(1 - 3\alpha) \end{bmatrix} = \begin{bmatrix} 8 - 32\alpha + 1 - 3\alpha \\ 2 - 8\alpha + 3 - 9\alpha \end{bmatrix} = \begin{bmatrix} 9 - 35\alpha \\ 5 - 17\alpha \end{bmatrix}$$
Now computing $x(\alpha)^T \left( Q x(\alpha) \right)$:
$$\begin{bmatrix} 2 - 8\alpha & 1 - 3\alpha \end{bmatrix} \begin{bmatrix} 9 - 35\alpha \\ 5 - 17\alpha \end{bmatrix} = (2 - 8\alpha)(9 - 35\alpha) + (1 - 3\alpha)(5 - 17\alpha)$$
Expanding:
$$= 18 - 70\alpha - 72\alpha + 280\alpha^2 + 5 - 17\alpha - 15\alpha + 51\alpha^2$$
$$= 23 - 174\alpha + 331\alpha^2$$
Thus:
$$\frac{1}{2} x(\alpha)^T Q x(\alpha) = \frac{1}{2}\left( 23 - 174\alpha + 331\alpha^2 \right) = \frac{23}{2} - 87\alpha + \frac{331}{2}\alpha^2$$
Second Term: Linear Term Expansion
$$b^T x(\alpha) = \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 2 - 8\alpha \\ 1 - 3\alpha \end{bmatrix} = 1(2 - 8\alpha) + 2(1 - 3\alpha) = 2 - 8\alpha + 2 - 6\alpha = 4 - 14\alpha$$
Step 3: Derivative to Find $\alpha_0$
$$\phi(\alpha) = \frac{23}{2} - 87\alpha + \frac{331}{2}\alpha^2 - (4 - 14\alpha)$$
$$= \frac{23}{2} - 87\alpha + \frac{331}{2}\alpha^2 - 4 + 14\alpha$$
$$= \frac{15}{2} - 73\alpha + \frac{331}{2}\alpha^2$$
Taking the derivative:
$$\phi'(\alpha) = -73 + \frac{331}{2}(2\alpha) = -73 + 331\alpha$$
Setting $\phi'(\alpha) = 0$:
$$331\alpha = 73$$
$$\alpha_0 = \frac{73}{331} \approx 0.2205$$
Final Answer for Step Size:
$$\alpha_0 \approx 0.2205$$
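For a quadratic objective, the exact minimizing step along any direction $p$ has the closed form $\alpha = -\frac{p^T g}{p^T Q p}$, obtained by setting $\phi'(\alpha) = 0$ directly. A sketch that reproduces $\alpha_0 = 73/331$ and previews the iterate update requested next:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.array([2.0, 1.0])

g0 = Q @ x0 - b
p0 = -g0  # H0 = I

# Exact line search for a quadratic: phi'(alpha) = p^T g + alpha * p^T Q p = 0
alpha0 = -(p0 @ g0) / (p0 @ Q @ p0)
print(alpha0)            # 0.2205438... = 73/331
print(x0 + alpha0 * p0)  # x1 = [0.2356..., 0.3383...]
```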
Your Turn
Now update the iterate:
$$x_1 = x_0 + \alpha_0 p_0$$
Substituting the values, compute $x_1$.
(RESPONSE):
We update the iterate:
$$x_1 = x_0 + \alpha_0 p_0$$
Substituting the known values:
$$x_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \quad p_0 = \begin{bmatrix} -8 \\ -3 \end{bmatrix}, \quad \alpha_0 = \frac{73}{331} \approx 0.2205$$
Performing the element-wise update:
$$x_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} + 0.2205 \begin{bmatrix} -8 \\ -3 \end{bmatrix} = \begin{bmatrix} 2 - 0.2205 \times 8 \\ 1 - 0.2205 \times 3 \end{bmatrix} = \begin{bmatrix} 2 - 1.764 \\ 1 - 0.6615 \end{bmatrix} = \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix}$$
Thus, the updated iterate is:
$$x_1 = \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix}$$
Next Step: Compute Differences for the BFGS Update
Now, compute:
$$s_0 = x_1 - x_0, \qquad y_0 = \nabla f(x_1) - \nabla f(x_0)$$
Since we already know $x_0$ and $x_1$, compute $s_0$ first. Then find $\nabla f(x_1)$ using the gradient formula:
$$\nabla f(x) = Qx - b$$
Use this to compute $y_0$. Let’s see your result!
We now compute the differences required for the BFGS update.
Step 1: Compute $s_0$
$$s_0 = x_1 - x_0$$
Substituting the values:
$$s_0 = \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.236 - 2 \\ 0.3385 - 1 \end{bmatrix} = \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix}$$
Step 2: Compute $y_0 = \nabla f(x_1) - \nabla f(x_0)$
We use the gradient formula:
$$\nabla f(x) = Qx - b$$
We already computed $\nabla f(x_0) = \begin{bmatrix} 8 \\ 3 \end{bmatrix}$. Now compute $\nabla f(x_1)$:
$$\nabla f(x_1) = Q x_1 - b = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
First, perform the matrix-vector multiplication:
$$Q x_1 = \begin{bmatrix} 4(0.236) + 1(0.3385) \\ 1(0.236) + 3(0.3385) \end{bmatrix} = \begin{bmatrix} 0.944 + 0.3385 \\ 0.236 + 1.0155 \end{bmatrix} = \begin{bmatrix} 1.2825 \\ 1.2515 \end{bmatrix}$$
Now subtract $b$:
$$\nabla f(x_1) = \begin{bmatrix} 1.2825 \\ 1.2515 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1.2825 - 1 \\ 1.2515 - 2 \end{bmatrix} = \begin{bmatrix} 0.2825 \\ -0.7485 \end{bmatrix}$$
Now, compute $y_0$:
$$y_0 = \nabla f(x_1) - \nabla f(x_0) = \begin{bmatrix} 0.2825 \\ -0.7485 \end{bmatrix} - \begin{bmatrix} 8 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.2825 - 8 \\ -0.7485 - 3 \end{bmatrix} = \begin{bmatrix} -7.7175 \\ -3.7485 \end{bmatrix}$$
Final Results
$$s_0 = \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix}, \qquad y_0 = \begin{bmatrix} -7.7175 \\ -3.7485 \end{bmatrix}$$
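In NumPy, the same differences (a sketch using the rounded $x_1$ from above; exact fractions give $s_0 = \frac{73}{331}(-8, -3)^T$):

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.array([2.0, 1.0])
x1 = np.array([0.236, 0.3385])  # rounded iterate from the line search step

s0 = x1 - x0                      # [-1.764, -0.6615]
y0 = (Q @ x1 - b) - (Q @ x0 - b)  # gradient difference: [-7.7175, -3.7485]
print(s0, y0)
```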
(RESPONSE):
Next Step: Compute $\rho_0$ and Update $H_1$
Now, compute:
$$\rho_0 = \frac{1}{y_0^T s_0}$$
$$H_1 = \left( I - \rho_0 s_0 y_0^T \right) H_0 \left( I - \rho_0 y_0 s_0^T \right) + \rho_0 s_0 s_0^T$$
Let’s see your results!
You’re right! Let’s wrap it up properly.
Final Step: Compute $\rho_0$ and Update $H_1$
Step 1: Compute $\rho_0$
$$\rho_0 = \frac{1}{y_0^T s_0}$$
First, compute $y_0^T s_0$:
$$y_0^T s_0 = (-7.7175)(-1.764) + (-3.7485)(-0.6615) = 13.6137 + 2.4796 = 16.0933$$
Thus:
$$\rho_0 = \frac{1}{16.0933} \approx 0.0621$$
Step 2: Compute $H_1$ using the BFGS update
$$H_1 = \left( I - \rho_0 s_0 y_0^T \right) H_0 \left( I - \rho_0 y_0 s_0^T \right) + \rho_0 s_0 s_0^T$$
Since $H_0 = I$, this reduces to:
$$H_1 = \left( I - \rho_0 s_0 y_0^T \right) \left( I - \rho_0 y_0 s_0^T \right) + \rho_0 s_0 s_0^T$$
Note that the product cannot be expanded term by term and the cross term dropped: multiplying out the factors produces $\rho_0^2 \left( y_0^T y_0 \right) s_0 s_0^T$, so we carry out the full matrix product.
Computing each term:
1. $\rho_0 s_0 y_0^T$ (its transpose gives $\rho_0 y_0 s_0^T$):
$$\rho_0 s_0 y_0^T = 0.0621 \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix} \begin{bmatrix} -7.7175 & -3.7485 \end{bmatrix} = 0.0621 \begin{bmatrix} 13.6137 & 6.6124 \\ 5.1052 & 2.4796 \end{bmatrix} = \begin{bmatrix} 0.8454 & 0.4106 \\ 0.3170 & 0.1540 \end{bmatrix}$$
Since $\rho_0 y_0 s_0^T = \left( \rho_0 s_0 y_0^T \right)^T$, the two factors are:
$$I - \rho_0 s_0 y_0^T = \begin{bmatrix} 0.1546 & -0.4106 \\ -0.3170 & 0.8460 \end{bmatrix}, \qquad I - \rho_0 y_0 s_0^T = \begin{bmatrix} 0.1546 & -0.3170 \\ -0.4106 & 0.8460 \end{bmatrix}$$
Their product is:
$$\left( I - \rho_0 s_0 y_0^T \right) \left( I - \rho_0 y_0 s_0^T \right) = \begin{bmatrix} 0.1925 & -0.3964 \\ -0.3964 & 0.8162 \end{bmatrix}$$
2. $\rho_0 s_0 s_0^T$:
$$\rho_0 s_0 s_0^T = 0.0621 \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix} \begin{bmatrix} -1.764 & -0.6615 \end{bmatrix} = 0.0621 \begin{bmatrix} 3.1117 & 1.1669 \\ 1.1669 & 0.4376 \end{bmatrix} = \begin{bmatrix} 0.1932 & 0.0725 \\ 0.0725 & 0.0272 \end{bmatrix}$$
Adding $\rho_0 s_0 s_0^T$ to the previous result:
$$H_1 = \begin{bmatrix} 0.1925 & -0.3964 \\ -0.3964 & 0.8162 \end{bmatrix} + \begin{bmatrix} 0.1932 & 0.0725 \\ 0.0725 & 0.0272 \end{bmatrix} = \begin{bmatrix} 0.3857 & -0.3239 \\ -0.3239 & 0.8434 \end{bmatrix}$$
Final Result:
The updated inverse Hessian approximation:
$$H_1 \approx \begin{bmatrix} 0.386 & -0.324 \\ -0.324 & 0.843 \end{bmatrix}$$
As expected, $H_1$ is symmetric and positive definite, which the BFGS update guarantees whenever $y_0^T s_0 > 0$.
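The update is compact to verify in NumPy; a sketch with the rounded $s_0$, $y_0$ from above, matching the hand computation to three decimals:

```python
import numpy as np

s0 = np.array([-1.764, -0.6615])
y0 = np.array([-7.7175, -3.7485])
H0 = np.eye(2)

rho0 = 1.0 / (y0 @ s0)
V = np.eye(2) - rho0 * np.outer(s0, y0)  # I - rho0 * s0 y0^T
H1 = V @ H0 @ V.T + rho0 * np.outer(s0, s0)
print(H1)                     # approx [[ 0.386 -0.324]
                              #         [-0.324  0.843]]
print(np.allclose(H1, H1.T))  # True: the update preserves symmetry
```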
Final Thoughts
This completes the first iteration of the BFGS algorithm. In subsequent iterations, we would:
- Check for convergence, e.g. whether $\|\nabla f(x_k)\|$ is sufficiently small.
- Compute a new search direction $p_k = -H_k \nabla f(x_k)$.
- Perform a line search for $\alpha_k$.
- Update $x_{k+1} = x_k + \alpha_k p_k$ and $H_{k+1}$, and iterate again.
A minimal sketch of this loop is given below.
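To tie the case study together, here is a minimal sketch of the full loop for this quadratic. It uses the exact quadratic line search derived above rather than a general-purpose Wolfe line search, so it applies to quadratics only:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.array([2.0, 1.0])  # x0
H = np.eye(2)             # H0
I = np.eye(2)

for k in range(20):
    g = Q @ x - b
    if np.linalg.norm(g) < 1e-10:       # convergence check
        break
    p = -H @ g                          # search direction
    alpha = -(p @ g) / (p @ Q @ p)      # exact line search (quadratic only)
    x_new = x + alpha * p
    s = x_new - x
    y = Q @ (x_new - x)                 # gradient difference; b cancels
    rho = 1.0 / (y @ s)
    V = I - rho * np.outer(s, y)
    H = V @ H @ V.T + rho * np.outer(s, s)  # BFGS inverse Hessian update
    x = x_new

print(x)                      # converges to Q^{-1} b = [1/11, 7/11]
print(np.linalg.solve(Q, b))  # [0.0909..., 0.6363...] for comparison
```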