MAT3007 Optimization
Optimality Conditions
Junfeng WU
School of Data Science
The Chinese University of Hong Kong, Shenzhen
1 / 33
Recap: Nonlinear Optimization
Some terminologies:
▶ Global vs local optimizer (minimizer)
▶ Gradient, Hessian, Taylor expansions
Then we studied the optimality conditions for unconstrained problems.
Theorem (First-Order Necessary Condition)
If x∗ is a local minimizer of f (·) for an unconstrained problem, then we
must have ∇f (x∗ ) = 0.
▶ The FONC can be used to find candidates for local minimizers
▶ However, FONC is not sufficient
2 / 33
Optimality Conditions for Unconstrained Problems (Continued)
3 / 33
Second-Order Necessary Conditions
4 / 33
Second-Order Necessary Condition
Consider the Taylor expansion again but to the 2nd order (assuming f is
twice continuously differentiable):
f (x + td) = f (x) + t∇f (x)⊤ d + (1/2) t2 d⊤ ∇2 f (x)d + o(t2 ).
When the first-order necessary condition holds, we have:
f (x + td) = f (x) + (1/2) t2 d⊤ ∇2 f (x)d + o(t2 ).
In order for x to be a local minimizer, we also need d⊤ ∇2 f (x)d to be
nonnegative for every d ∈ Rn .
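A small numerical sanity check (not from the slides) of this expansion: the sketch below evaluates the quadratic model for an arbitrary smooth test function f and confirms that the remainder behaves like o(t2). The function, the point x, and the direction d are all illustrative choices.

```python
import numpy as np

# Illustrative test function: f(x) = x1^4 + x1*x2 + (1 + x2)^2.
def f(x):
    return x[0]**4 + x[0]*x[1] + (1 + x[1])**2

def grad(x):
    return np.array([4*x[0]**3 + x[1], x[0] + 2*(1 + x[1])])

def hess(x):
    return np.array([[12*x[0]**2, 1.0],
                     [1.0,        2.0]])

x = np.array([1.0, -0.5])
d = np.array([0.3, 0.7])

for t in [1e-1, 1e-2, 1e-3]:
    quad = f(x) + t * grad(x) @ d + 0.5 * t**2 * d @ hess(x) @ d
    err = abs(f(x + t*d) - quad)
    print(f"t={t:.0e}  remainder={err:.2e}  remainder/t^2={err/t**2:.2e}")
# remainder/t^2 tends to 0 as t shrinks, which is exactly the o(t^2) behavior.
```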
5 / 33
Second-Order Necessary Condition (SONC)
Theorem: Second-Order Necessary Conditions
If x∗ is a local minimizer of f , then it holds that:
1. ∇f (x∗ ) = 0;
2. For all d ∈ Rn : d⊤ ∇2 f (x∗ )d ≥ 0.
Definition: Semidefiniteness
We call a (symmetric) matrix A positive (negative) semidefinite (PSD/NSD)
if and only if for all x we have x⊤ Ax ≥ 0 (≤ 0).
Remark:
▶ Therefore, the second-order necessary condition requires the Hessian
matrix at x∗ to be PSD. In the one-dimensional case, this is
equivalent to f ′′ (x∗ ) ≥ 0.
6 / 33
Positive Semidefinite Matrices
Here are some useful facts about PSD matrices:
▶ We usually only talk about PSD properties for symmetric matrices.
▶ If a matrix A is not symmetric, we use (1/2)(A + A⊤ ) to define the PSD
properties (because x⊤ Ax = (1/2) x⊤ (A + A⊤ )x).
▶ A symmetric matrix is PSD if and only if all the eigenvalues are
nonnegative.
▶ For any matrix A, A⊤ A is a (symmetric) PSD matrix.
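A minimal sketch (not part of the slides) showing how these facts can be checked numerically with NumPy: symmetrize the matrix, inspect its eigenvalues, and verify that A⊤ A is PSD. The example matrices are arbitrary.

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Test positive semidefiniteness via the symmetric part and its eigenvalues."""
    S = 0.5 * (A + A.T)               # x^T A x = x^T S x, so definiteness is judged via S
    return np.linalg.eigvalsh(S).min() >= -tol

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])           # eigenvalues 1 and 3, so PSD (in fact PD)
B = np.array([[1.0, 3.0],
              [0.0, 1.0]])            # not symmetric; judged via (B + B^T)/2

print(is_psd(A))          # True
print(is_psd(B))          # False: (B + B^T)/2 has eigenvalue -0.5
print(is_psd(B.T @ B))    # True: A^T A is always PSD
```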
7 / 33
Example Continued
For f (x) := x4 − 9x2 + 4x − 1, the second-order condition is:
f ′′ (x) = 12x2 − 18 ≥ 0
Only x1 = −1 − √6/2 and x3 = 2 satisfy the condition. But for the point
x2 = −1 + √6/2, we obtain f ′′ (x2 ) = 12(1 − √6) < 0 (thus, x2 is not a
local minimizer).
In the example of the least squares problem, we use the following fact:
▶ If f (x) = x⊤ M x (M is symmetric), then ∇2 f (x) = 2M .
Hence the Hessian matrix in that problem is 2X ⊤ X, which is always a
PSD matrix, so the SONC always holds!
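As a quick numerical companion to this example (assuming f (x) = x4 − 9x2 + 4x − 1 as above), the sketch below finds the stationary points from f ′ and evaluates f ′′ at each, reproducing the signs used on this slide.

```python
import numpy as np

# f(x) = x^4 - 9x^2 + 4x - 1, so f'(x) = 4x^3 - 18x + 4 and f''(x) = 12x^2 - 18.
stationary = np.sort(np.roots([4.0, 0.0, -18.0, 4.0]).real)   # roots of f'
for x in stationary:
    print(f"x = {x: .4f},   f''(x) = {12*x**2 - 18: .4f}")
# The smallest and largest stationary points have f'' > 0 (SONC holds there);
# the middle one has f'' < 0, so it cannot be a local minimizer.
```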
8 / 33
SONC is Not Sufficient
However, even if both the first- and second-order necessary conditions
hold, we still cannot guarantee that the candidate is a local minimum!
Example: Consider f (x) = x3 at 0.
▶ f ′ (0) = f ′′ (0) = 0, thus FONC and SONC hold.
▶ But 0 is not a local minimum.
▶ A point x satisfying ∇f (x) = 0 is called a critical point or stationary
point.
▶ The SONC can be used to verify that a stationary point is not a local
minimizer.
⇝ By modifying the SONC, we can get a sufficient condition.
9 / 33
Second-Order Sufficient Conditions
10 / 33
Second-Order Sufficient Condition (SOSC)
Theorem: Second-Order Sufficient Conditions
Let f be twice continuously differentiable. If x∗ satisfies:
1. ∇f (x∗ ) = 0;
2. For all d ∈ Rn \{0}: d⊤ ∇2 f (x∗ )d > 0;
then x∗ is a strict local minimum/minimizer of f .
Definition: Definite Matrices
We call a (symmetric) matrix A positive (negative) definite (PD/ND) if and
only if for all x ̸= 0: x⊤ Ax > 0 (< 0).
▶ A PD matrix must be PSD (thus PD is a stronger notion).
▶ A symmetric matrix is PD ⇐⇒ all its eigenvalues are positive.
11 / 33
Proof
We need the following lemma
Lemma: Bounds and Eigenvalues
Let A ∈ Rn×n be a symmetric matrix. Then
λmin (A)∥x∥2 ≤ x⊤ Ax ≤ λmax (A)∥x∥2 ∀ x ∈ Rn ,
where λmin (A) and λmax (A) are the smallest and largest eigenvalues of A.
The proof is by another variant of the Taylor expansion (using ∇f (x∗ ) = 0), i.e.,
f (x∗ + d) = f (x∗ ) + (1/2) d⊤ ∇2 f (x∗ )d + o(∥d∥2 ),
as d tends to 0.
When ∇2 f (x∗ ) is positive definite, the lemma gives d⊤ ∇2 f (x∗ )d ≥ µ∥d∥2 ,
where µ > 0 is the smallest eigenvalue of ∇2 f (x∗ ).
Thus, we have
f (x∗ + d) ≥ f (x∗ ) + (µ/2)∥d∥2 + o(∥d∥2 ) = f (x∗ ) + ∥d∥2 ( µ/2 + o(∥d∥2 )/∥d∥2 ).
Since o(∥d∥2 )/∥d∥2 → 0 as ∥d∥ → 0, for all sufficiently small d we have
o(∥d∥2 )/∥d∥2 ≥ −µ/4, which shows
f (x∗ + d) > f (x∗ ).
12 / 33
For Maximization Problems
Our conditions are derived for minimization problems. For maximization
problems, we just change the inequalities. Let f ∈ C 2 (twice continuously
differentiable).
Theorem: FONC for Maximization
If x∗ is a local (unconstrained) maximizer of f , then we must have ∇f (x∗ ) =
0.
Theorem: SONC for Maximization
If x∗ is a local maximizer of f , then we must have 1.) ∇f (x∗ ) = 0; 2.)
∇2 f (x∗ ) is negative semidefinite.
Theorem: SOSC for Maximization
If x∗ satisfies 1.) ∇f (x∗ ) = 0; 2.) ∇2 f (x∗ ) is negative definite, then x∗ is
a strict local maximizer.
13 / 33
Optimality Conditions
Optimality Conditions for Unconstrained Problems:
▶ First-order necessary condition.
▶ Second-order necessary condition.
▶ Second-order sufficient condition.
In many cases, we can utilize these conditions to identify local and global
optimal solutions.
General Strategy:
▶ Use FONC and SONC to identify all possible candidates. Then, use
the sufficient conditions to verify.
▶ If a problem has only one stationary point and one can argue that an
optimal solution must exist (be attained), then this point must be the
(global) optimum.
14 / 33
Examples–I
In the example f (x) = x4 − 9x2 + 4x − 1, the points x1 and x3 satisfy the
second-order sufficient condition (f ′′ (x) > 0 there) and are local minimizers.
In the least squares problem, if X ⊤ X is positive definite (equivalently,
since it is always PSD, if it is invertible), then the solution β of the FONC
X ⊤ Xβ = X ⊤ y
is unique and it satisfies the second-order sufficient conditions.
⇝ It must be the unique global minimizer of the problem.
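A brief illustration (with randomly generated data, not from the slides) that solving the FONC system X ⊤ Xβ = X ⊤ y recovers the least squares minimizer, and that the Hessian 2X ⊤ X is positive definite here.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))            # X^T X is PD with probability 1 here
y = rng.standard_normal(50)

beta = np.linalg.solve(X.T @ X, X.T @ y)    # unique solution of the FONC

beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # NumPy's own least squares solver
print(np.allclose(beta, beta_ls))                  # True
print(np.linalg.eigvalsh(X.T @ X).min() > 0)       # Hessian 2 X^T X is PD, so SOSC holds
```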
15 / 33
Optimality Conditions for Unconstrained Problems (Continued)
16 / 33
Constrained Problems
We have derived necessary and sufficient conditions for local minimizers
of unconstrained problems.
▶ What is the difference between constrained and unconstrained
problems?
Consider the example f (x) = 100x2 (1 − x)2 − x with constraint
−0.2 ≤ x ≤ 0.8.
In addition to the original local minimizer (x1 = 0.013), there is one more
local minimizer on the boundary (x = 0.8).
17 / 33
Constrained Problems
At the boundary point (x∗ = 0.8), the FONC is not satisfied:
f ′ (0.8) < 0.
However, at this point, in order to stay feasible, we can only move leftward.
That is, in the Taylor expansion
f (x∗ + d) = f (x∗ ) + df ′ (x∗ ) + o(d)
we can only take d to be negative (otherwise it won’t be feasible).
Thus f (x∗ + d) > f (x∗ ) for every sufficiently small feasible step d, i.e., in a
small neighborhood of x∗ within the feasible region. Hence x∗ is a local minimizer.
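A quick numerical check of this argument (using f (x) = 100x2 (1 − x)2 − x from the previous slide, an assumption recalled here): f ′ (0.8) is negative, and every feasible (leftward) step increases f .

```python
import numpy as np

f = lambda x: 100 * x**2 * (1 - x)**2 - x
fprime = lambda x: 200 * x * (1 - x) * (1 - 2*x) - 1

x_star = 0.8
print(fprime(x_star))                    # about -20.2 < 0, so the unconstrained FONC fails
for d in [-1e-1, -1e-2, -1e-3]:          # only d < 0 keeps x_star + d feasible
    print(f(x_star + d) > f(x_star))     # True each time: feasible moves increase f
```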
18 / 33
Feasible Directions
Now we formalize the above arguments.
Definition (Feasible Direction)
Given x ∈ F , we call d to be a feasible direction at x if there exists ᾱ > 0
such that x + αd ∈ F for all 0 ≤ α ≤ ᾱ.
For example,
▶ If F = {x | Ax = b}, then the set of feasible directions at x is {d | Ad = 0}.
▶ If F = {x | Ax ≥ b}, then the set of feasible directions at x is
{d | aTi d ≥ 0 for all i with aTi x = bi }.
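A small sketch (toy data, not from the slides) of the feasible-direction test for F = {x | Ax ≥ b}: a direction d is feasible at x exactly when aTi d ≥ 0 for every active constraint.

```python
import numpy as np

def is_feasible_direction(A, b, x, d, tol=1e-9):
    """Feasible-direction test at x for the set {x : A x >= b}."""
    active = np.isclose(A @ x, b, atol=tol)       # constraints with a_i^T x = b_i
    return bool(np.all(A[active] @ d >= -tol))

A = np.eye(2)                   # constraints x1 >= 0 and x2 >= 0
b = np.zeros(2)
x = np.array([0.0, 2.0])        # first constraint active, second inactive

print(is_feasible_direction(A, b, x, np.array([1.0, -1.0])))   # True
print(is_feasible_direction(A, b, x, np.array([-1.0, 0.0])))   # False: violates x1 >= 0
```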
19 / 33
FONC for Constrained Problems
Theorem (FONC for Constrained Problems)
If x∗ is a local minimum, then for any feasible direction d at x∗ , we must
have ∇f (x∗ )T d ≥ 0.
In unconstrained problems, all directions are feasible (both d and −d for
every d), so we must have ∇f (x∗ ) = 0.
20 / 33
An Alternative View
Definition (Descent Direction)
Let f be continuously differentiable. Then d is called a descent direction
at x if and only if ∇f (x)T d < 0.
⇝ If d is a descent direction at x, then there exists γ̄ > 0 such that
f (x + γd) < f (x) for all 0 < γ ≤ γ̄.
Denote the set of feasible directions at x by SF (x) and the set of
descent directions at x by SD (x). Then the first-order necessary condition
can be written as:
SF (x∗ ) ∩ SD (x∗ ) = ∅
In other words, there cannot be any feasible descent direction at x∗ .
21 / 33
Nonlinear Optimization with Equality Constraints
Consider
minimize_x  f (x)
s.t.  Ax = b
▶ The feasible direction set is {d|Ad = 0}.
▶ The descent direction set is {d|∇f (x)T d < 0}.
The FONC says that, at a local minimum, no d can solve both systems at
once, i.e., there is no feasible descent direction.
Theorem (Alternative System)
The system Ad = 0 and ∇f (x)T d < 0 does not have a solution if and
only if there exists y such that
AT y = ∇f (x)
22 / 33
Nonlinear Optimization with Equality Constraints
Therefore, the first-order necessary condition for
minimize_x  f (x)                    (1)
s.t.  Ax = b
is that there exists y such that
AT y = ∇f (x)
Theorem
If x∗ is a local minimum for (1), then there must exist y such that
AT y = ∇f (x∗ )
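One hedged way (not from the slides) to test this condition numerically: solve AT y = ∇f (x) in the least squares sense and check whether the residual vanishes, i.e., whether ∇f (x) lies in the row space of A. The toy instance below is the example on the next slide.

```python
import numpy as np

def fonc_equality_holds(A, grad_x, tol=1e-8):
    """Check whether A^T y = grad_x is solvable (grad_x lies in the row space of A)."""
    y, *_ = np.linalg.lstsq(A.T, grad_x, rcond=None)
    return np.linalg.norm(A.T @ y - grad_x) <= tol

# Toy instance: minimize (x1-1)^2 + (x2-1)^2  s.t.  x1 + x2 = 1.
A = np.array([[1.0, 1.0]])
grad = lambda x: np.array([2*(x[0] - 1), 2*(x[1] - 1)])

print(fonc_equality_holds(A, grad(np.array([0.5, 0.5]))))   # True: FONC candidate
print(fonc_equality_holds(A, grad(np.array([1.0, 0.0]))))   # False: FONC fails here
```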
23 / 33
Proof
First, it is easy to see that if there exists y such that AT y = ∇f (x), then
there cannot be a d with Ad = 0 and ∇f (x)T d < 0 (multiplying both sides
of the equation by dT gives ∇f (x)T d = (Ad)T y = 0, a contradiction).
To prove the reverse, consider the LP:
minimize_d  ∇f (x)T d
s.t.  Ad = 0
If there does not exist d satisfying Ad = 0 and ∇f (x)T d < 0, then the
optimal value of this LP must be 0 (d = 0 is feasible with objective value 0).
Therefore, by the strong duality theorem, its dual problem must also be
feasible (with optimal value 0). The dual constraint is exactly
AT y = ∇f (x), so such a y exists. Thus the theorem is proved. □
24 / 33
Example
Consider the problem:
minimize (x1 − 1)2 + (x2 − 1)2
s.t. x1 + x2 = 1
▶ This problem finds the nearest point on the line x1 + x2 = 1 to the
point (1, 1)
Figure: Finding the nearest point on the line to (1,1)
25 / 33
Example Continued
By the FONC, x = (x1 , x2 ) is a local minimizer if there exists y such that
AT y = ∇f (x)
Here A = (1, 1) and ∇f (x) = (2x1 − 2, 2x2 − 2)T .
Thus the FONC requires that there exists y such that
2x1 − 2 = y,   2x2 − 2 = y.
Combined with the constraint x1 + x2 = 1, we find that
x1 = x2 = 1/2
is the only candidate for a local minimizer, and it is indeed a local
minimizer (also a global minimizer).
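A short check (a sketch, not the slides' own computation) that solves the FONC together with the constraint as one linear system in (x1 , x2 , y) and recovers x1 = x2 = 1/2.

```python
import numpy as np

# Equations: 2x1 - y = 2,  2x2 - y = 2,  x1 + x2 = 1, unknowns (x1, x2, y).
M = np.array([[2.0, 0.0, -1.0],
              [0.0, 2.0, -1.0],
              [1.0, 1.0,  0.0]])
rhs = np.array([2.0, 2.0, 1.0])

x1, x2, y = np.linalg.solve(M, rhs)
print(x1, x2, y)    # 0.5 0.5 -1.0
```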
26 / 33
Another Example
Consider a constrained version of the least squares problem:
minimize_β  ∥Xβ − y∥²
s.t.  W β = ξ
The gradient is 2(X T Xβ − X T y).
Therefore, the FONC is that there exists z such that
W T z = 2(X T Xβ − X T y)
Therefore, an optimal β must satisfy (together with some z):
W β = ξ,    X T Xβ = (1/2) W T z + X T y
27 / 33
Another Example Continued
W β = ξ,    X T Xβ = (1/2) W T z + X T y
We can write this as:
[ W       0          ] [ β ]   [ ξ     ]
[ X T X   −(1/2) W T ] [ z ] = [ X T y ]
Let the size of X be m × n and the size of W be d × n. Then this is a
system of n + d linear equations in n + d unknowns (β ∈ Rn and z ∈ Rd ).
Solving this system yields the unique candidate for a local minimizer
(provided the coefficient matrix on the left-hand side has full rank).
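A minimal sketch (random data; it assumes X has full column rank and W full row rank so that the block system is nonsingular) of assembling and solving this system with NumPy.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, d = 30, 5, 2
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)
W = rng.standard_normal((d, n))
xi = rng.standard_normal(d)

# Block system:  [ W        0       ] [beta]   [ xi    ]
#                [ X^T X  -(1/2)W^T ] [ z  ] = [ X^T y ]
K = np.block([[W,        np.zeros((d, d))],
              [X.T @ X,  -0.5 * W.T      ]])
rhs = np.concatenate([xi, X.T @ y])

sol = np.linalg.solve(K, rhs)
beta, z = sol[:n], sol[n:]
print(np.allclose(W @ beta, xi))    # True: the candidate satisfies the constraint
```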
28 / 33
Inequality Constraints
Now we consider an inequality constrained problem:
minimize_x  f (x)
s.t.  Ax ≥ b                    (2)
What should be the necessary optimality conditions?
Theorem
If x∗ is a local minimum of (2), then there exists some y ≥ 0 satisfying
∇f (x∗ ) = AT y
yi · (aTi x∗ − bi ) = 0, ∀i
where aTi is the ith row of A.
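A hedged illustration on a made-up instance (minimize (x1 + 1)2 + (x2 + 1)2 subject to x ≥ 0, i.e., A = I and b = 0): at x∗ = (0, 0) both constraints are active, and y = (2, 2) verifies all three requirements of the theorem.

```python
import numpy as np

# Made-up instance: f(x) = (x1+1)^2 + (x2+1)^2, constraints A x >= b with A = I, b = 0.
A = np.eye(2)
b = np.zeros(2)
grad = lambda x: np.array([2*(x[0] + 1), 2*(x[1] + 1)])

x_star = np.zeros(2)            # the constrained minimizer sits on the boundary
y = np.array([2.0, 2.0])        # candidate multipliers

print(np.allclose(A.T @ y, grad(x_star)))       # True: gradient condition
print(bool(np.all(y >= 0)))                     # True: nonnegative multipliers
print(np.allclose(y * (A @ x_star - b), 0.0))   # True: complementary slackness
```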
29 / 33
Proof
We consider the descent directions and the feasible directions at x∗ .
First it is easy to see that the descent directions are:
SD (x∗ ) = {d : ∇f (x∗ )T d < 0}
For the feasible directions, it is
SF (x∗ ) = {d : aTi d ≥ 0, if aTi x∗ = bi }
Local optimality requires that SD (x∗ ) ∩ SF (x∗ ) = ∅. Define
A(x) = {i : aTi x = bi } to be the set of active constraints at x; then the
necessary condition is:
There does not exist d such that
1. ∇f (x∗ )T d < 0
2. aTi d ≥ 0 for i ∈ A(x∗ )
30 / 33
Proof Continued
The nonexistence of d such that
1. ∇f (x)T d < 0
2. aTi d ≥ 0 for i ∈ A(x)
is equivalent (again by LP duality; this is Farkas’ lemma) to the existence
of y ≥ 0 such that
∇f (x) = Σ_{i ∈ A(x)} ai yi
This can be further written as the following conditions:
▶ There exists y ≥ 0 such that
∇f (x) = AT y
yi · (aTi x − bi ) = 0, ∀i
31 / 33
More General Cases — KKT Conditions
We have discussed cases with linear equality constraints or linear inequality
constraints and derived the (necessary) optimality conditions
▶ We want to extend them to more general cases — KKT conditions
▶ We call the first-order necessary conditions for a general optimization
problem the KKT conditions
▶ Solutions that satisfy the KKT conditions are called KKT points.
▶ KKT points are candidate points for local optimal solutions.
▶ The KKT conditions were originally named after H. Kuhn and A.
Tucker, who first published the conditions in 1951. Later scholars
discovered that the conditions had been stated by W. Karush in his
master’s thesis in 1939.
32 / 33
Find KKT Conditions
We consider the general nonlinear optimization problem:
minimize_x  f (x)
s.t.  gi (x) ≥ 0,   i = 1, ..., m
      hi (x) = 0,   i = 1, ..., p
      ℓi (x) ≤ 0,   i = 1, ..., r
      xi ≥ 0,   i ∈ M
      xi ≤ 0,   i ∈ N
      xi free,   i ∉ M ∪ N
One can use the feasible/descent direction arguments to derive the KKT
conditions, but it is not very convenient.
▶ In the next lecture, we present a direct approach
33 / 33