UW-Madison CS/ISyE/Math/Stat 726 Spring 2024
Lecture 1–2: Optimization Background
Yudong Chen
1 Introduction
Our standard optimization problem:

min_{x ∈ X} f(x)    (P)
• x: a vector, the optimization/decision variable
• X: feasible set
• f(x): objective function, real-valued
• max_{x} f(x) ⇐⇒ min_{x} −f(x)
The (optimal) value of (P):

val(P) = inf_{x ∈ X} f(x).
To fully specify (P), we need to specify
• vector space, feasible set, objective function;
• what it means to solve (P).
1.1 Can we even hope to solve an arbitrary optimization problem?
Example 1. Suppose we want to find positive integers x, y, z satisfying

x^3 + y^3 = z^3.

Can be formulated as a (continuous) optimization problem (PF):

min_{x,y,z,n} (x^n + y^n − z^n)^2
s.t. x ≥ 1, y ≥ 1, z ≥ 1, n ≥ 3    (PF)
sin^2(πn) + sin^2(πx) + sin^2(πy) + sin^2(πz) = 0.
If we could certify whether val(PF) ≠ 0, we would have found a proof of Fermat’s Last Theorem (1637):

For any n ≥ 3, x^n + y^n = z^n has no solutions over the positive integers.

Proved by Andrew Wiles in 1994.
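The penalty formulation above can be sanity-checked numerically: at integer points the sin^2 terms vanish, leaving only (x^n + y^n − z^n)^2. A minimal sketch (the function name pf_objective is ours, not from the notes):

```python
import math

# Evaluate the two pieces of the objective of (PF) at a candidate point.
# At integer x, y, z, n the sin^2 penalty is (numerically) zero, so the
# objective reduces to (x^n + y^n - z^n)^2.
def pf_objective(x, y, z, n):
    fermat_term = (x ** n + y ** n - z ** n) ** 2
    penalty = sum(math.sin(math.pi * t) ** 2 for t in (n, x, y, z))
    return fermat_term, penalty

val, pen = pf_objective(3, 4, 5, 3)  # 3^3 + 4^3 - 5^3 = -34, so val = 1156
```

Fermat’s Last Theorem says the first term is strictly positive at every feasible integer point with n ≥ 3, i.e., val(PF) > 0.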
Example 2. Unconstrained optimization, many local minima.¹
We cannot hope for solving an arbitrary optimization problem.
We need some structure.
2 Specifying the optimization problem
2.1 Vector space
This is where the optimization variable and the feasible set live.
(Rd , ∥·∥): normed vector space, “primal space”.
• The variable x is a (column) vector in R^d:

x = (x_1, x_2, . . . , x_d)^⊤.
• The norm tells us how to measure distances in R^d.
Most often, we will take ∥x∥ = ∥x∥_2 = (∑_{i=1}^d x_i^2)^{1/2} (Euclidean norm).
We sometimes also consider the ℓ_p norm ∥x∥_p = (∑_{i=1}^d |x_i|^p)^{1/p}, p ≥ 1:
• ∥x∥_1 = ∑_i |x_i|,
• ∥x∥_∞ = max_{1≤i≤d} |x_i|.
(Plots of unit balls of ℓ2 , ℓ1 , ℓ∞ norms.)
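As a quick illustration of the definitions, a small sketch computing the ℓ_1, ℓ_2, and ℓ_∞ norms directly from the formulas above (pure Python, no libraries assumed):

```python
# l_p norm from the definition: (sum_i |x_i|^p)^(1/p), with the
# l-infinity norm as the limiting case max_i |x_i|.
def lp_norm(x, p):
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [3.0, -4.0]
n1 = lp_norm(x, 1)               # |3| + |-4| = 7
n2 = lp_norm(x, 2)               # sqrt(9 + 16) = 5
ninf = lp_norm(x, float("inf"))  # max(3, 4) = 4
```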
¹ Left: plot by Jelena Diakonikolas. Right: loss surfaces of ResNet-56 without skip connections (https://arxiv.org/pdf/1712.09913.pdf).
We will use ⟨·, ·⟩ to denote inner products. Standard inner product:

⟨x, y⟩ = x^⊤ y = ∑_{i=1}^d x_i y_i.
When we work with (R^d, ∥·∥_p), view ⟨y, x⟩ as the value of a linear function y at x. So, if we are measuring the length of x using ∥·∥_p, we should measure the length of y using ∥·∥_q, where 1/p + 1/q = 1.
Definition 1 (Dual norm). The dual norm of ∥·∥ is given by

∥z∥_* := sup_{∥x∥ ≤ 1} ⟨z, x⟩.
From the definition we immediately have the following.

Proposition 1 (Hölder’s inequality). For all z, x ∈ R^d:

|⟨z, x⟩| ≤ ∥z∥_* · ∥x∥.
Proof. Fix any two vectors x, z. Assume x ≠ 0 and z ≠ 0 (otherwise the inequality is trivial). Define x̂ = x/∥x∥. Then

∥z∥_* ≥ ⟨z, x̂⟩ = ⟨z, x⟩ / ∥x∥,

and hence ⟨z, x⟩ ≤ ∥z∥_* · ∥x∥. Applying the same argument with x replaced by −x proves −⟨z, x⟩ ≤ ∥z∥_* · ∥x∥.
Example 3. ∥·∥_p and ∥·∥_q are duals when 1/p + 1/q = 1. In particular, ∥·∥_2 is its own dual; ∥·∥_1 and ∥·∥_∞ are dual to each other.
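Hölder’s inequality and the dual pairs in Example 3 can be spot-checked numerically; a small sketch over random vectors:

```python
import random

def lp_norm(x, p):
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def inner(x, z):
    return sum(xi * zi for xi, zi in zip(x, z))

# Check |<z, x>| <= ||z||_q * ||x||_p for the dual pairs (p, q).
random.seed(0)
holder_ok = True
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(5)]
    z = [random.uniform(-1, 1) for _ in range(5)]
    for p, q in [(2, 2), (1, float("inf")), (float("inf"), 1)]:
        if abs(inner(x, z)) > lp_norm(z, q) * lp_norm(x, p) + 1e-12:
            holder_ok = False
```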
In R^d, all ℓ_p norms are equivalent. In particular,

∀ x ∈ R^d, p ≥ 1, r > p : ∥x∥_r ≤ ∥x∥_p ≤ d^{1/p − 1/r} ∥x∥_r.

However, the choice of norm affects how an algorithm’s performance depends on the dimension d.
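The equivalence bounds can likewise be tested on random vectors; a sketch verifying ∥x∥_r ≤ ∥x∥_p ≤ d^{1/p − 1/r} ∥x∥_r for a few pairs r > p ≥ 1:

```python
import random

def lp_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

# Verify ||x||_r <= ||x||_p <= d^(1/p - 1/r) ||x||_r for r > p >= 1.
random.seed(1)
d = 6
equiv_ok = True
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(d)]
    for p, r in [(1, 2), (2, 4), (1, 3)]:
        lo, hi = lp_norm(x, r), lp_norm(x, p)
        if not (lo <= hi + 1e-12 and hi <= d ** (1.0 / p - 1.0 / r) * lo + 1e-12):
            equiv_ok = False
```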
2.2 Feasible set
The feasible set
X ⊆ Rd
specifies what solution points we are allowed to output.
If X = Rd , we say that (P) is unconstrained. Otherwise we say that (P) is constrained.
X can be specified:
• as an abstract geometric body (a ball, a box, a polyhedron, a convex set)
• via functional constraints:
g_i(x) ≤ 0, i = 1, 2, . . . , m,
h_i(x) = 0, i = 1, . . . , p.

Note that a constraint f_i(x) ≥ C is equivalent to taking g_i(x) = C − f_i(x).
Example 4.
X = B_2(0, 1) = { x ∈ R^d : ∥x∥_2 ≤ 1 } (the unit Euclidean ball).
In this class, we will always assume that X is closed.
Heine–Borel Theorem: X ⊆ R^d is closed and bounded if and only if it is compact (if X ⊆ ∪_{α∈A} U_α for some family of open sets {U_α}, then there exists a finite subfamily {U_{α_i}}_{i=1}^n such that X ⊆ ∪_{1≤i≤n} U_{α_i}).
Weierstrass Extreme Value Theorem: If X is compact and f is a function that is defined and
continuous on X , then f attains its extreme values on X .
What if X is not bounded? Consider f(x) = e^x. Then inf_{x∈R} f(x) = 0, but the infimum is not attained.
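A quick numerical illustration of the contrast: on the compact set [−1, 1] the minimum of e^x is attained (at x = −1), while on all of R the infimum 0 is only approached. A sketch:

```python
import math

# On the compact interval [-1, 1], e^x attains its minimum at the endpoint -1,
# as Weierstrass guarantees (checked on a fine grid).
grid = [-1 + 2 * i / 1000 for i in range(1001)]
min_on_compact = min(math.exp(x) for x in grid)  # = e^{-1}

# On R, taking x -> -infinity drives e^x toward 0, but no point achieves 0.
tail_values = [math.exp(-t) for t in (10.0, 100.0)]  # positive, shrinking
```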
When we work with unconstrained problems, we will normally assume that f is bounded
below.
Convex sets: Except for some special cases, we often assume that the feasible set is convex, so
that we will be able to guarantee tractability.
Definition 2 (Convex set). A set X ⊆ Rd is convex if
∀ x, y ∈ X , ∀α ∈ (0, 1) : (1 − α) x + αy ∈ X
A picture.
We cannot hope to deal with arbitrary nonconvex constraints. E.g., x_i(1 − x_i) = 0 ⇐⇒ x_i ∈ {0, 1}, which yields integer programs.
2.3 Objective function
“cost”, “loss”
Extended real-valued functions:

f : D → R ∪ {−∞, +∞} ≡ R̄.

Here f is defined on D ⊆ R^d. We can extend the definition of f to all of R^d by assigning the value +∞ at each point x ∈ R^d \ D.

Effective domain:

dom(f) = { x ∈ R^d : f(x) < ∞ }.
In the sequel, domain means effective domain.
“Linear and nonlinear optimization” ≈ “continuous optimization” (as contrasted with discrete/combinatorial optimization).
2.3.1 Lower semicontinuous functions
We mostly assume f to be continuous, which can be relaxed slightly.
Definition 3. A function f : R^d → R̄ is said to be lower semicontinuous (l.s.c.) at x ∈ R^d if

f(x) ≤ lim inf_{y→x} f(y).

We say f is l.s.c. on R^d if it is l.s.c. at every point x ∈ R^d.
This definition is mainly useful for allowing indicator functions.
Example 5. Verify yourself: Indicator of a closed set is l.s.c.
I_X(x) = 0 if x ∈ X, and I_X(x) = ∞ if x ∉ X.

Using I_X we can write

min_{x ∈ X} f(x) ≡ min_{x ∈ R^d} { f(x) + I_X(x) },
thereby unifying constrained and unconstrained optimization.
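This unification can be illustrated concretely; a sketch with a hypothetical f(x) = (x − 2)^2 and X = [−1, 1], comparing the constrained minimum with the penalized one:

```python
import math

def f(x):
    return (x - 2.0) ** 2

# Indicator of the closed set X = [-1, 1]: zero inside, +infinity outside.
def indicator(x):
    return 0.0 if -1.0 <= x <= 1.0 else math.inf

grid = [-3 + 6 * i / 6000 for i in range(6001)]
constrained = min(f(x) for x in grid if -1.0 <= x <= 1.0)  # min of f over X
penalized = min(f(x) + indicator(x) for x in grid)         # min of f + I_X over R
# Both equal f(1) = 1: the indicator makes infeasible points infinitely costly.
```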
2.3.2 Continuous and smooth functions
Unless we are abstracting away constraints, the least we will assume about f is that it is continuous.
Sometimes we consider stronger assumptions.
Definition 4. f : Rd → R̄ is said to be
1. Lipschitz-continuous on X ⊆ Rd (w.r.t. the norm ∥·∥) if there exists M < ∞ such that
∀ x, y ∈ X : | f ( x ) − f (y)| ≤ M ∥ x − y∥ .
2. Smooth on X ⊆ R^d (w.r.t. the norm ∥·∥) if f’s gradient is Lipschitz-continuous, i.e., there exists L < ∞ such that²

∀ x, y ∈ X : ∥∇f(x) − ∇f(y)∥_* ≤ L ∥x − y∥.
(Gradient: ∇f(x) = (∂f/∂x_1, . . . , ∂f/∂x_d)^⊤.)
² This definition can be viewed as a quantitative version of C¹-smoothness.
• Picture:
In Rd , Lipschitz-continuity in some norm implies the same for every other norm, but M may differ.
Example 6. f(x) = (1/2)∥x∥_2^2 is 1-smooth on R^d w.r.t. ∥·∥_2. The log-sum-exp (or softmax) function f(x) = log ∑_{i=1}^d exp(x_i) is 1-smooth on R^d w.r.t. ∥·∥_∞.
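The 1-smoothness claims can be tested numerically. For log-sum-exp, smoothness w.r.t. ∥·∥_∞ means ∥∇f(x) − ∇f(y)∥_1 ≤ ∥x − y∥_∞, since ∥·∥_1 is dual to ∥·∥_∞, and the gradient of log-sum-exp is the softmax map. A sketch:

```python
import math
import random

# Gradient of log-sum-exp: the softmax map (shifted by max for stability).
def softmax(x):
    m = max(x)
    e = [math.exp(xi - m) for xi in x]
    s = sum(e)
    return [ei / s for ei in e]

# Check ||softmax(x) - softmax(y)||_1 <= ||x - y||_inf on random pairs.
random.seed(2)
smooth_ok = True
for _ in range(200):
    x = [random.uniform(-3, 3) for _ in range(4)]
    y = [random.uniform(-3, 3) for _ in range(4)]
    lhs = sum(abs(a - b) for a, b in zip(softmax(x), softmax(y)))
    rhs = max(abs(a - b) for a, b in zip(x, y))
    if lhs > rhs + 1e-9:
        smooth_ok = False
```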
Example 7. A function that is continuously differentiable on its domain but not smooth:

f(x) = 1/x,  dom(f) = R_{++}.
2.3.3 Convex functions
Definition 5. f : Rd → R̄ is convex if ∀ x, y ∈ Rd , ∀α ∈ (0, 1) :
f ((1 − α) x + αy) ≤ (1 − α) f ( x ) + α f (y).
A picture.
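The defining inequality is easy to test numerically for a specific function; a sketch checking it for the convex function f(x) = x^2 at random points:

```python
import random

# Check f((1-a)x + ay) <= (1-a)f(x) + a f(y) for f(x) = x^2.
def f(x):
    return x ** 2

random.seed(3)
convex_ok = True
for _ in range(200):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    a = random.uniform(0.0, 1.0)
    lhs = f((1 - a) * x + a * y)
    rhs = (1 - a) * f(x) + a * f(y)
    if lhs > rhs + 1e-9:
        convex_ok = False
```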
Lemma 1. f : R^d → R is convex if and only if its epigraph

epi(f) := { (x, a) : x ∈ R^d, a ∈ R, f(x) ≤ a }

is convex.
Proof. Follows from definitions. Left as exercise.
Definition 6. We say that a function f : Rd → R̄ is proper if ∃ x ∈ Rd s.t. f ( x ) ∈ R.
Lemma 2. If f : Rd → R̄ is proper and convex, then dom( f ) is convex.