
A Brief Overview of Optimization Problems

Steven G. Johnson
MIT course 18.335, Spring 2019

Why optimization?
• In some sense, all engineering design is
optimization: choosing design parameters to
improve some objective
• Much of data analysis is also optimization:
extracting some model parameters from data while
minimizing some error measure (e.g. fitting)
• Most business decisions = optimization: varying
some decision parameters to maximize profit (e.g.
investment portfolios, supply chains, etc.)

A general optimization problem
minimize an objective function f0 with respect to n design parameters x
(also called decision parameters, optimization variables, etc.):

    min_{x∈ℝⁿ} f0(x)

— note that maximizing g(x) corresponds to f0(x) = –g(x)

subject to m constraints:

    fi(x) ≤ 0    (i = 1, …, m)

note that an equality constraint h(x) = 0 yields two inequality constraints
fi(x) = h(x) and fi+1(x) = –h(x)
(although, in practical algorithms, equality constraints typically require special handling)

x is a feasible point if it satisfies all the constraints;
the feasible region = the set of all feasible x
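As a concrete illustration of this formulation (not part of the original slides), a minimal sketch using scipy.optimize.minimize; the objective f0 and constraint f1 below are made-up examples, and SciPy's convention requires flipping the sign of the constraint:

    import numpy as np
    from scipy.optimize import minimize

    # made-up objective f0 and a single inequality constraint f1(x) <= 0
    f0 = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2
    f1 = lambda x: x[0] + x[1] - 3.0                 # feasible when x[0] + x[1] <= 3

    # SciPy expects inequality constraints in the form g(x) >= 0, so pass g = -f1
    res = minimize(f0, x0=np.zeros(2),
                   constraints=[{"type": "ineq", "fun": lambda x: -f1(x)}])
    print(res.x, res.fun)                            # a (local) constrained optimum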
Important considerations
• Global versus local optimization
• Convex vs. non-convex optimization
• Unconstrained or box-constrained optimization, and
other special-case constraints
• Special classes of functions (linear, etc.)
• Differentiable vs. non-differentiable functions
• Gradient-based vs. derivative-free algorithms
• …
• Zillions of different algorithms, usually restricted to
various special cases, each with strengths/weaknesses

Global vs. Local Optimization
• For general nonlinear functions, most algorithms only
guarantee a local optimum
– that is, a feasible xo such that f0(xo) ≤ f0(x) for all feasible x
within some neighborhood ||x–xo|| < R (for some small R)
• A much harder problem is to find a global optimum: the
minimum of f0 for all feasible x
– exponentially increasing difficulty with increasing n, practically
impossible to guarantee that you have found the global minimum
without knowing some special property of f0
– many available algorithms, problem-dependent efficiencies
• not just genetic algorithms or simulated annealing (which are popular,
easy to implement, and thought-provoking, but usually very slow!)
• for example, non-random systematic search algorithms (e.g. DIRECT),
partially randomized searches (e.g. CRS2), repeated local searches from
different starting points (“multistart” algorithms, e.g. MLSL), …
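For illustration, a minimal multistart-style sketch (an assumption, not from the slides): repeated local searches from random starting points on a made-up multimodal objective, keeping the best result:

    import numpy as np
    from scipy.optimize import minimize

    # a made-up multimodal objective (Rastrigin-like) with many local minima
    def f0(x):
        return np.sum(x**2) + 10.0 * np.sum(1.0 - np.cos(2.0 * np.pi * x))

    rng = np.random.default_rng(1)
    best = None
    for _ in range(50):                          # repeated local searches ("multistart")
        x0 = rng.uniform(-5.0, 5.0, size=2)      # random starting point in a box
        res = minimize(f0, x0, method="BFGS")    # local gradient-based search
        if best is None or res.fun < best.fun:
            best = res

    print(best.x, best.fun)                      # hopefully near the global minimum at x = 0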
Convex Optimization
[ good reference: Convex Optimization by Boyd and Vandenberghe,
free online at www.stanford.edu/~boyd/cvxbook ]

All the functions fi (i = 0…m) are convex:

    fi(αx + βy) ≤ α fi(x) + β fi(y)    where α + β = 1, α, β ∈ [0,1]

[ figure: a convex f(x) lies on or below the chord α f(x) + β f(y) between any two
  points x, y; a non-convex f(x) does not ]

For a convex problem (convex objective & constraints)
any local optimum must be a global optimum
⇒ efficient, robust solution methods available
Important Convex Problems
• LP (linear programming): the objective and
constraints are affine: fi(x) = aiᵀx + αi
• QP (quadratic programming): affine constraints +
convex quadratic objective xᵀAx + bᵀx
• SOCP (second-order cone program): LP + cone
constraints ||Ax + b||₂ ≤ aᵀx + α
• SDP (semidefinite programming): constraints are that
Σk Akxk is positive-semidefinite

all of these have very efficient, specialized solution methods
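For example, a small LP can be stated almost verbatim in a convex-modeling package; the sketch below uses CVXPY (a Python relative of the CVX/CVXOPT packages mentioned later, and an assumption here), with made-up data constructed so the problem is feasible and bounded:

    import numpy as np
    import cvxpy as cp

    # made-up LP data, built so the problem is feasible and bounded
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 5))
    b = A @ (0.5 * np.ones(5)) + 0.1                 # x = 0.5 is strictly feasible
    c = rng.standard_normal(5)

    x = cp.Variable(5)
    prob = cp.Problem(cp.Minimize(c @ x),            # affine objective
                      [A @ x <= b, x >= 0, x <= 1])  # affine (linear + box) constraints
    prob.solve()
    print(prob.value, x.value)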
Important special constraints
• Simplest case is the unconstrained optimization
problem: m=0
– e.g., line-search methods like steepest-descent,
nonlinear conjugate gradients, Newton methods …
• Next-simplest are box constraints (also called
bound constraints): xkmin ≤ xk ≤ xkmax
– easily incorporated into line-search methods and many
other algorithms
– many algorithms/software only handle box constraints
• …
• Linear equality constraints Ax=b
– for example, these can be explicitly eliminated from the
problem by writing x = Ny + ξ, where ξ is a particular solution of
Aξ = b and the columns of N form a basis for the nullspace of A
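A minimal sketch of this nullspace elimination (the objective and constraint are made-up; scipy.linalg.null_space supplies the nullspace basis N):

    import numpy as np
    from scipy.linalg import null_space
    from scipy.optimize import minimize

    # made-up problem: minimize f0(x) subject to one linear equality constraint A x = b
    A = np.array([[1.0, 1.0, 1.0]])
    b = np.array([1.0])
    f0 = lambda x: np.sum((x - np.array([0.2, 0.5, 0.9])) ** 2)

    xi = np.linalg.lstsq(A, b, rcond=None)[0]    # a particular solution of A xi = b
    N = null_space(A)                            # columns form a basis for the nullspace of A

    # unconstrained optimization over the reduced variables y, with x = N y + xi
    res = minimize(lambda y: f0(N @ y + xi), np.zeros(N.shape[1]))
    x_opt = N @ res.x + xi
    print(x_opt, A @ x_opt)                      # A x_opt = b holds by construction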
Derivatives of fi
• Most-efficient algorithms typically require user to
supply the gradients ∇x fi of objective/constraints
– you should always compute these analytically
• rather than use finite-difference approximations, better to just
use a derivative-free optimization algorithm
• in principle, one can always compute ∇x fi with about the same
cost as fi, using adjoint methods
– gradient-based methods can find (local) optima of
problems with millions of design parameters
• Derivative-free methods: only require fi values
– easier to use, can work with complicated “black-box”
functions where computing gradients is inconvenient
– may be only possibility for nondifferentiable problems
– need > n function evaluations, bad for large n
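To illustrate the difference, a small sketch (made-up objective, not from the slides) comparing a gradient-based local search, where the analytic gradient is supplied via SciPy's jac argument, with a derivative-free one:

    import numpy as np
    from scipy.optimize import minimize

    # made-up smooth objective and its analytic gradient
    def f0(x):
        return np.sum((x - 1.0) ** 4)

    def grad_f0(x):
        return 4.0 * (x - 1.0) ** 3

    # gradient-based local search: handles n = 1000 parameters easily
    res_grad = minimize(f0, np.zeros(1000), jac=grad_f0, method="L-BFGS-B")

    # derivative-free local search: only f0 values, needs many more evaluations even for n = 10
    res_free = minimize(f0, np.zeros(10), method="Nelder-Mead")

    print(res_grad.nfev, res_free.nfev)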
Removable non-differentiability
consider the non-differentiable unconstrained problem:

    min_{x∈ℝⁿ} |f0(x)|

[ figure: |f0(x)| is the upper envelope of f0(x) and –f0(x), with the optimum at a kink ]

equivalent to the minimax problem:

    min_{x∈ℝⁿ} max{ f0(x), –f0(x) }

…still nondifferentiable…

…equivalent to a constrained problem (now differentiable!) with a “temporary” variable t:

    min_{x∈ℝⁿ, t∈ℝ} t    subject to:   t ≥ f0(x)      i.e.  f1(x,t) = f0(x) – t
                                        t ≥ –f0(x)           f2(x,t) = –f0(x) – t
Example: Chebyshev linear fitting
find the line fit b ≈ a x1 + x2 to N points (ai, bi)
that minimizes the maximum error:

    min_{x1,x2} max_i |ai x1 + x2 – bi|  =  min_{x∈ℝ²} ||Ax – b||∞

[ figure: a scatter of N points (ai, bi) with the fitted line b = a x1 + x2 ]

… a nondifferentiable minimax problem

equivalent to a linear programming problem (LP):

    min_{x1,x2,t} t    subject to 2N constraints:
        t ≥ ai x1 + x2 – bi            equivalently:  |ai x1 + x2 – bi| ≤ t
        t ≥ –(ai x1 + x2 – bi)
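A minimal sketch of this Chebyshev fit as an LP, using scipy.optimize.linprog on made-up data (the points are an assumption, scattered around b = 2a + 1):

    import numpy as np
    from scipy.optimize import linprog

    # made-up data: N points (a_i, b_i) scattered around the line b = 2 a + 1
    rng = np.random.default_rng(0)
    N = 50
    a = np.linspace(0.0, 1.0, N)
    b = 2.0 * a + 1.0 + 0.1 * rng.standard_normal(N)

    # variables z = (x1, x2, t); minimize t subject to the 2N constraints above:
    #   a_i x1 + x2 - t <= b_i     and     -a_i x1 - x2 - t <= -b_i
    c = [0.0, 0.0, 1.0]
    ones = np.ones((N, 1))
    A_ub = np.block([[ a[:, None],  ones, -ones],
                     [-a[:, None], -ones, -ones]])
    b_ub = np.concatenate([b, -b])

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 3)
    x1, x2, t = res.x
    print(f"fit b = {x1:.3f} a + {x2:.3f}, maximum error t = {t:.4f}")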
Relaxations of Integer Programming
If x is integer-valued rather than real-valued (e.g. x ∈ {0,1}ⁿ),
the resulting integer programming or combinatorial optimization
problem becomes much harder in general.

However, useful results can often be obtained by a continuous
relaxation of the problem — e.g., going from x ∈ {0,1}ⁿ to x ∈ [0,1]ⁿ
… at the very least, this gives a lower bound on the optimum f0

“Penalty terms” or “projection filters” (SIMP, RAMP, etc.)
can be used to obtain an x that is ≈ 0 or ≈ 1 almost everywhere.

[ See e.g. Sigmund & Maute, “Topology optimization approaches,”
Struct. Multidisc. Opt. 48, pp. 1031–1055 (2013). ]

Example: Topology Optimization
design a structure to do something, made of material A or B…
let every pixel of discretized structure vary continuously from A to B
[ + tricks to impose minimum feature size and mostly “binary” A/B ]

[ figure: cantilever design example — the density of each pixel varies continuously
  from 0 (air) to a maximum; the goal is a cantilever supporting the maximum weight
  (applied force) with a fixed amount of material; the optimized structure is shown
  deformed under load ]

[ Buhl et al., Struct. Multidisc. Optim. 19, 93–104 (2000) ]
Stochastic Optimization
    min_{x∈ℝⁿ} E[ f(x, ω) ]     where E[⋯] is the expected value,
                                averaging over the random variables ω

Deep-learning example:
fitting (“learning”) to a huge “training set”
by sampling a random subset S:

    f(x, ω) = Σ_{k∈S} fk(x)

∇x f often exists, but typically one can’t use standard
gradient descent because of the randomness.

A popular “stochastic gradient descent” algorithm: Adam [Kingma & Ba, 2014]
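A minimal stochastic-gradient-descent sketch on a made-up least-squares “training” problem (an assumption for illustration; Adam adds momentum and adaptive per-coordinate step sizes on top of this basic update):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((10_000, 20))            # huge "training set"
    b = A @ rng.standard_normal(20) + 0.01 * rng.standard_normal(10_000)

    x = np.zeros(20)
    step = 1e-3
    for it in range(2_000):
        S = rng.integers(0, len(b), size=32)             # random subset (minibatch) of the data
        g = 2.0 * A[S].T @ (A[S] @ x - b[S]) / len(S)    # gradient of the sampled terms only
        x -= step * g                                    # stochastic gradient-descent step

    print(np.linalg.norm(A @ x - b) / np.sqrt(len(b)))   # residual on the full set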
Some Sources of Software
• NLopt: implements many nonlinear optimization algorithms
callable from many languages (C, Python, R, Matlab, …)
(global/local, constrained/unconstrained, derivative/no-derivative)
http://github.com/stevengj/nlopt

• Python: scipy.optimize, pyOpt, …; Julia: JuMP, Optim,…

• Decision tree for optimization software:


http://plato.asu.edu/guide.html
— lists many (somewhat older) packages for many problems

• CVX: general convex-optimization package http://cvxr.com


… also Python CVXOPT, R CVXR, Julia Convex
MIT OpenCourseWare
https://ocw.mit.edu

18.335J Introduction to Numerical Methods


Spring 2019

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
