Lecture 7

The document discusses various unconstrained optimization algorithms, including single-variable and multivariate methods, emphasizing their efficiency, ability to handle non-smooth problems, and global optimization capabilities. It covers techniques such as random methods, cyclic coordinate search, Nelder-Mead simplex method, and biologically inspired algorithms like genetic algorithms and particle swarm optimization. The document also highlights the importance of performance comparison and practical examples to locate optima efficiently.


Engineering Optimization

Concepts and Applications

Lise Noël
Matthijs Langelaar
3mE-PME 34-G-1-300
[email protected]
[email protected]
ME46060

Unconstrained optimization
algorithms
● Single-variable methods

● Multiple variable methods (‘multivariate optimization’)

– 0th order

– 1st order

– 2nd order

Recap optimization algorithms
● Aspects to consider:

– Efficiency (speed of convergence, computational effort, scaling with number of variables)
– Use of derivatives

– Ability to handle non-smooth problems

– Ability to find global optima

– Termination criteria

● Exhaustive approaches (brute force) are generally not feasible

Summary single variable methods


● Bracketing, plus:

0th order:
 Dichotomous sectioning
 Fibonacci sectioning
 Golden ratio sectioning (see the sketch after this list)
 Quadratic interpolation

1st order:
 Cubic interpolation
 Bisection method
 Secant method

2nd order:
 Newton method

In practice, additional “tricks” are needed to deal with:
 Multimodality
 Strong fluctuations
 Round-off errors
 Divergence

● And many more!
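For illustration, a minimal Python sketch of golden-ratio sectioning (referenced in the list above); the interval, tolerance, and quadratic test function are illustrative assumptions, not taken from the lecture:

import math

def golden_section(f, a, b, tol=1e-6):
    # Golden-ratio sectioning on [a, b] for a unimodal function f
    invphi = (math.sqrt(5) - 1) / 2               # 1/phi ~ 0.618
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:                               # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:                                     # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

print(golden_section(lambda x: (x - 2.0)**2, 0.0, 5.0))   # ~2.0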

Unconstrained optimization
algorithms
● Single-variable methods: min f(x) over a scalar x

● Multiple variable methods: min f(x) over a vector x = (x1, …, xn)

– 0th order: Direct search methods

– 1st order: Descent methods, also called Hill-climbing methods or Gradient-based methods

– 2nd order

Contents
● General aspects

● Direct search methods:

– Random methods

– Cyclic coordinate search / Powell’s conjugate directions

– Nelder-Mead simplex method

– Biologically inspired methods

 Genetic algorithms

 Particle swarm / ant colony

● First order methods


Algorithm performance
● Comparison of performance of algorithms:

– Mathematical convergence proofs

– Performance on benchmark problems (test functions)

● Examples of test functions:

– Rosenbrock’s function (“banana function”)


f = 100 (x2 − x1^2)^2 + (1 − x1)^2

Optimum: (1, 1)

Test functions
● Quadratic function:

f = (x1 + 2 x2 − 7)^2 + (2 x1 + x2 − 5)^2
Optimum: (1, 3)

● Many local optima:


f = −50 cos(x1) cos(x2) + x1^2 + x2^2
Optimum: (0, 0)

● And many others …
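The test functions above are easy to code; a minimal Python sketch (the expressions follow the reconstructed formulas above and should be treated as assumptions):

import numpy as np

def rosenbrock(x):
    # "Banana" function, minimum f = 0 at (1, 1)
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def quadratic(x):
    # Quadratic test function, minimum f = 0 at (1, 3)
    return (x[0] + 2.0*x[1] - 7.0)**2 + (2.0*x[0] + x[1] - 5.0)**2

def multimodal(x):
    # Many local optima, global minimum at (0, 0)
    return -50.0 * np.cos(x[0]) * np.cos(x[1]) + x[0]**2 + x[1]**2

print(rosenbrock([1.0, 1.0]), quadratic([1.0, 3.0]), multimodal([0.0, 0.0]))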

Please note:
● In this lecture: isolines of the objective function are shown for illustration purposes
  → Many function evaluations are needed to create such a plot!

● In reality: the underlying function is unknown!
  → The optimizer must evaluate f(x) at individual designs

Practical example
● Try your own approach!

● 2 design variables

● Single, global optimum

● Try to locate the minimum in the least number of evaluations (evaluations are expensive)

Contents
● General aspects

● Direct search methods:

– Random methods

– Cyclic coordinate search / Powell’s conjugate directions

– Nelder-Mead simplex method

– Biologically inspired methods

 Genetic algorithms

 Particle swarm / ant colony

● First order methods



Random (stochastic) methods


● Random jumping method (random search):

– Generate random points, remember the best

● Random walk method:

– Generate random unit direction vectors

– “Walk” to new point if better

– Decrease stepsize after N steps
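A minimal Python sketch of both random methods; the bounds, step sizes, and iteration counts are illustrative assumptions:

import numpy as np

def random_jumping(f, lower, upper, n_eval=1000, seed=0):
    # Random search: sample uniform random points in a box, remember the best
    rng = np.random.default_rng(seed)
    best_x, best_f = None, np.inf
    for _ in range(n_eval):
        x = rng.uniform(lower, upper)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

def random_walk(f, x0, step=1.0, n_outer=20, n_inner=50, shrink=0.5, seed=0):
    # Random walk: try random unit directions, move if better,
    # decrease the step size after every n_inner trials
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(n_outer):
        for _ in range(n_inner):
            d = rng.normal(size=x.size)
            d /= np.linalg.norm(d)                # random unit direction vector
            y = x + step * d
            fy = f(y)
            if fy < fx:                           # "walk" to the new point if better
                x, fx = y, fy
        step *= shrink
    return x, fx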

Simulated Annealing
● Random method inspired by a physical process: annealing = heating and gradual cooling of metal/glass to relieve internal stresses

– End result: minimum internal energy

– Temperature-dependent probability of local internal energy change

– Some chance of a local energy increase exists!

(Figure: local internal energy vs. time)

Simulated Annealing Algorithm


1. Set a starting “temperature” T, pick a starting design x, and obtain f(x)

2. Randomly generate a new design y close to x, and obtain f(y)

3. If f(y) < f(x), accept y and continue with step 4. Otherwise:

   a. Compute the probability P(T) of accepting the worse design

   b. Use a random number to accept design y or not, then continue with step 4

4. Reduce the temperature T and return to step 2 (until convergence)
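A minimal Python sketch of steps 1–4, using the acceptance probability from the next slide; the initial temperature, cooling factor, step size, and iteration count are illustrative assumptions:

import numpy as np

def simulated_annealing(f, x0, T0=1.0, cooling=0.95, step=0.5, n_iter=2000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)               # step 1: starting design and temperature
    fx = f(x)
    T = T0
    best_x, best_f = x.copy(), fx
    for _ in range(n_iter):
        y = x + step * rng.normal(size=x.size)    # step 2: random design close to x
        fy = f(y)
        # step 3: accept improvements; accept a worse design with
        # probability P = exp((f(x) - f(y)) / T)
        if fy < fx or rng.random() < np.exp((fx - fy) / T):
            x, fx = y, fy
            if fx < best_f:
                best_x, best_f = x.copy(), fx
        T *= cooling                              # step 4: reduce temperature
    return best_x, best_f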

Simulated Annealing Algorithm (2)
● Probability of accepting a ‘bad’ step depends on T, as in physical annealing:

  P_accept( f(y) > f(x) ) = e^[ (f(x) − f(y)) / T ],   with f(x) − f(y) < 0

● Test: generate a random number r, accept if r < P.

● As the temperature reduces, the probability of accepting a bad step reduces as well: the exponent (f(x) − f(y)) / T becomes increasingly negative, so P → 0.

Simulated Annealing Properties


● Accepting bad steps (“energy increase”) is likely in the initial phase, but less likely at the end

  T = 0: basic random walk method

  → SA can escape local optima, especially at the start

● Variants: several steps before the test, cooling schemes, reheating, …

Matlab: simulannealbnd

Random methods properties
● Very robust: work also for discontinuous /
nondifferentiable functions

● Can find global minimum (unknown when)

● Quite inefficient, but can be used in an initial stage to determine a promising starting point

● Last resort: when all else fails

● S.A. is known to perform well on several hard problems (“traveling salesman”)

● Drawback: results are not repeatable (unless you initialize the random number generator with fixed settings)

Contents
● General aspects

● Direct search methods:

– Random methods

– Cyclic coordinate search / Powell’s conjugate directions

– Nelder-Mead simplex method

– Biologically inspired methods

 Genetic algorithms

 Particle swarm / ant colony

● First order methods


Cyclic coordinate search
● Search alternatingly in each coordinate direction (design variable)

● Perform a single-variable optimization along each direction s (line search, = partial minimization):

  min over α:  f(x + α s)

● Directions are fixed: this can lead to slow convergence
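A minimal Python sketch of cyclic coordinate search, using scipy's scalar minimizer for the line searches; the fixed number of cycles is an assumed stopping rule:

import numpy as np
from scipy.optimize import minimize_scalar

def cyclic_coordinate_search(f, x0, n_cycles=20):
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(n_cycles):
        for i in range(n):
            s = np.zeros(n)
            s[i] = 1.0                                         # i-th coordinate direction
            alpha = minimize_scalar(lambda a: f(x + a * s)).x  # line search: min_a f(x + a s)
            x = x + alpha * s
    return x, f(x)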


Powell’s Conjugate Directions method


● Adjusting search directions improves convergence

● Idea: after a full cycle, also search in the combined direction, and add it to the direction set while removing the first one:

  Directions in cycle 1: { s1, s2, s3 = α1^(1) s1 + α2^(1) s2 }
  Directions in cycle 2: { s2, s3, s4 = α2^(2) s2 + α3^(2) s3 }

● Guaranteed to converge in n cycles for quadratic functions! (theoretically)
  (= n(n+1) line searches)

(Figure: combined direction α1 s1 + α2 s2 after line searches along s1 and s2)
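Powell's method is available as a ready-made direct search in scipy; a short usage sketch on the banana function (the starting point and tolerances are illustrative assumptions):

from scipy.optimize import minimize

def rosenbrock(x):
    # "Banana" function, minimum at (1, 1)
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

res = minimize(rosenbrock, x0=[-1.2, 1.0], method="Powell",
               options={"xtol": 1e-8, "ftol": 1e-8})
print(res.x, res.nfev)   # should approach (1, 1); nfev counts function evaluations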
Nelder-Mead Simplex method
● Simplex: figure of n + 1 points in R^n

● Gradually move toward the minimum by reflecting the worst point through the centroid of the other points
  (Figure: simplex with vertex values f = 5, f = 7, f = 10; the worst point is reflected)

● For better performance: expansion/contraction and other adjustments

Nelder-Mead Simplex in action

f(x1, x2) = (x1^2 + x2 − 7)^2 + (x1 + x2^2 − 11)^2

Matlab: see fminsearch
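The scipy counterpart of fminsearch is minimize with method="Nelder-Mead"; a short usage sketch on the function above (the starting point and tolerances are illustrative assumptions):

from scipy.optimize import minimize

def f(x):
    return (x[0]**2 + x[1] - 7.0)**2 + (x[0] + x[1]**2 - 11.0)**2

res = minimize(f, x0=[0.0, 0.0], method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-8})
print(res.x, res.fun, res.nit)   # solution, objective value, number of iterations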


Contents
● General aspects

● Direct search methods:

– Random methods

– Cyclic coordinate search / Powell’s conjugate directions

– Nelder-Mead simplex method

– Biologically inspired methods

 Genetic algorithms

 Particle swarm / ant colony

● First order methods



Biologically inspired methods


● Popular: inspiration for algorithms from biological processes:
– Genetic algorithms / evolutionary optimization

– Particle swarms / flocks

– Ant colony methods

● Typically make use of population (collection of designs)

● Computationally intensive

● Stochastic nature, global optimization properties


Genetic algorithms (GA)
● Based on Darwin’s theory of evolution: survival of the fittest

● Objective = fitness function

● Designs are encoded in chromosomal strings (~ genes), e.g. binary strings:

  1 1 0 1 0 0 1 0 1 1 0 0 1 0 1
  (bits encoding x1 and x2)

  Can also include discrete variables!

GA flowchart
Create initial population → Evaluate fitness of all individuals → Test termination criteria:

– criteria not met: Select individuals for reproduction → Create new population (crossover, mutation, reproduction) → Evaluate fitness → …

– criteria met: Quit

13
GA population operators
● Reproduction:

– Exact copy/copies of individual

● Mutation:

– Randomly flip some bits of a gene string

– Used sparingly, but important to explore new designs

1 1 0 1 0 0 1 0 1 1 0 0 1 0 1

1 1 0 1 0 1 1 0 1 1 0 0 1 0 1


GA population operators (2)


● Crossover:

– Randomly exchange genes of different parents

– Many possibilities: how many genes, parents, children …
Parent 1: 1 1 0 1 | 0 0 1 0 1 1 0 0 1 0 1        Parent 2: 0 1 1 0 | 1 0 0 1 0 1 1 0 0 0 1

Child 1:  0 1 1 0 | 0 0 1 0 1 1 0 0 1 0 1        Child 2:  1 1 0 1 | 1 0 0 1 0 1 1 0 0 0 1

(single-point crossover after the fourth bit)
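A minimal Python sketch of a binary-coded GA combining the operators above (tournament selection, single-point crossover, bit-flip mutation); the 8+7 bit split, variable bounds, population size, and rates are illustrative assumptions, not taken from the lecture:

import numpy as np

def decode(bits, lower=-5.0, upper=5.0, n1=8):
    # Map a binary string to two real variables: the first n1 bits encode x1, the rest x2
    v1 = int("".join(map(str, bits[:n1])), 2) / (2**n1 - 1)
    v2 = int("".join(map(str, bits[n1:])), 2) / (2**(len(bits) - n1) - 1)
    return lower + (upper - lower) * np.array([v1, v2])

def ga(objective, n_bits=15, pop_size=30, n_gen=50, p_cross=0.8, p_mut=0.02, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n_bits))           # create initial population
    best_bits, best_f = None, np.inf
    for _ in range(n_gen):
        f = np.array([objective(decode(ind)) for ind in pop])   # evaluate fitness (minimized here)
        if f.min() < best_f:
            best_f, best_bits = f.min(), pop[f.argmin()].copy()
        children = []
        while len(children) < pop_size:
            i, j = rng.integers(0, pop_size, size=2)            # tournament selection, parent 1
            p1 = pop[i] if f[i] < f[j] else pop[j]
            i, j = rng.integers(0, pop_size, size=2)            # tournament selection, parent 2
            p2 = pop[i] if f[i] < f[j] else pop[j]
            c1, c2 = p1.copy(), p2.copy()
            if rng.random() < p_cross:                          # single-point crossover
                cut = rng.integers(1, n_bits)
                c1[cut:], c2[cut:] = p2[cut:].copy(), p1[cut:].copy()
            for c in (c1, c2):                                  # mutation: flip a few random bits
                flip = rng.random(n_bits) < p_mut
                c[flip] ^= 1
            children += [c1, c2]
        pop = np.array(children[:pop_size])
    return decode(best_bits), best_f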
Genetic Algorithm properties
● Random:

– Very robust: works also for discontinuous / nondifferentiable functions

– Can find global minimum (but unknown when)

● Many different variations / strategies / parameters; not easy to determine the best settings

● Computationally intensive (population × generations)

● Population: set of results

Matlab: see ga



Particle swarms / flocks

● No genes and reproduction, but a population that travels through the design space

● Derived from simulations of flocks/schools in nature

● Individuals tend to follow the individual with the best fitness value, but also determine their own path

● Some randomness added to give exploration properties (“craziness parameter”)

Ant colony (Matlab): MIDACO* (*http://www.midaco-solver.com/)

15
PSO (Particle Swarm Optimization)
Example:

http://www.itm.uni-stuttgart.de/research/pso_opt


Basic particle swarm optimization algorithm (Matlab: see particleswarm)

1. Initialize location x0 and speed v0 of individuals (random)

2. Evaluate fitness (= objective) for each individual

3. Update best positions: individual (y) and overall (Y)

4. Update velocity and position:

   v_{i+1} = v_i + c1 r1 (y_i − x_i) + c2 r2 (Y_i − x_i)
   x_{i+1} = x_i + v_{i+1}

   c1, c2 control “individual” vs. “social” behavior; r1, r2 are random number vectors between 0 and 1
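A minimal Python sketch of the update rule above; an inertia weight w is added (the slide's formula corresponds to w = 1), and the swarm size, iteration count, and c1 = c2 = 1.5 are illustrative assumptions:

import numpy as np

def pso(f, lower, upper, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    dim = lower.size
    x = rng.uniform(lower, upper, size=(n_particles, dim))   # 1. random locations
    v = np.zeros_like(x)                                     #    and zero initial speeds
    y = x.copy()                                             # individual best positions
    fy = np.array([f(p) for p in x])                         # 2. evaluate fitness
    Y = y[fy.argmin()].copy()                                # overall best position
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))                  # random vectors between 0 and 1
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (y - x) + c2 * r2 * (Y - x)    # 4. velocity update
        x = x + v                                            #    position update
        fx = np.array([f(p) for p in x])
        better = fx < fy                                     # 3. update individual bests
        y[better], fy[better] = x[better], fx[better]
        Y = y[fy.argmin()].copy()                            #    and the overall best
    return Y, fy.min()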

Overview 0th order methods
● Random:

– Jumping, Walk, Simulated Annealing

– Biologically inspired: GA, particle swarm / ant colony

● Cyclic coordinate search (series of 1D optimizations)

● Powell’s conjugate directions

● Nelder-Mead Simplex (for ~smooth problems)


Summary 0th order methods


● Nelder-Mead beats Powell in most cases

● Robust: most can deal with discontinuity etc.

● Less attractive for many design variables (>10)

● Stochastic techniques:

– Computationally expensive, but

– Global optimization properties

– Versatile

● Population-based algorithms are easy to combine with parallel computing

Unconstrained optimization
algorithms
● Single-variable methods

● Multiple variable methods

– 0th order

– 1st order

– 2nd order


Practical example 2
● Now gradient information is also available at each point

● Again, try to locate the optimum in the least number of steps!


18
Steepest descent method
● Move in the direction of largest decrease in f:

  Taylor:  f(x + h s) = f(x) + ∇f^T s h + o(h^2)

  ⇒  df = f(x + h s) − f(x) ≈ ∇f^T s h

  Best direction:  s = −∇f

● Example:  f = x1^4 − 2 x1^2 x2 + x2^2

  ∇f = [ 4 x1^3 − 4 x1 x2 ,  −2 x1^2 + 2 x2 ]^T = 2 (x1^2 − x2) [ 2 x1 , −1 ]^T

  (Figure: contours f = 0.044, f = 1.9, f = 7.2 in the (x1, x2) plane, with −∇f steps)

  Divergence occurs! Remedy: line search

Steepest Descent algorithm

1. Start with arbitrary x1

2. Set the first search direction:  d1 = −∇f1

3. Line search to find the next point:  x_{i+1} = x_i + α_i d_i

4. Update the search direction:  d_{i+1} = −∇f_{i+1}

5. Repeat from step 3
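A minimal Python sketch of steps 1–5, using scipy's scalar minimizer for the line search; the iteration limit and gradient tolerance are assumed stopping criteria:

import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, n_iter=100, tol=1e-8):
    x = np.asarray(x0, dtype=float)                         # 1. arbitrary starting point
    for _ in range(n_iter):
        d = -grad(x)                                        # 2./4. search direction d = -grad f
        if np.linalg.norm(d) < tol:
            break
        alpha = minimize_scalar(lambda a: f(x + a * d)).x   # 3. line search for the step size
        x = x + alpha * d
    return x, f(x)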

Steepest Descent method (2)
● With line search:  min over α:  f(x − α ∇f)

– The gradient is perpendicular to the isoline f = const

– The line search direction is tangent to the isoline at the line search minimum

– The new gradient after the line search is perpendicular to the previous direction

  Along an isoline the directional derivative is zero:
  lim_{h→0} [ f(x + h t) − f(x) ] / h = ∇f^T t = 0   ⇒   ∇f ⊥ t

  (Figure: isolines f = const and f = const > f, with the line search along −∇f)

Steepest Descent algorithm

1. Start with arbitrary x1

2. Set the first search direction:  d1 = −∇f1

3. Line search to find the next point:  x_{i+1} = x_i + α_i d_i

4. Update the search direction:  d_{i+1} = −∇f_{i+1}   (d_{i+1} ⊥ d_i)

5. Repeat from step 3

Steepest descent convergence
● Zig-zag convergence behavior: each step is perpendicular (⊥) to the previous one

Effect of variable scaling on Steepest Descent

● Scaling of variables helps a lot!

  f = x1^2 + 16 x2^2

  With y1 = x1, y2 = 4 x2:   f = y1^2 + y2^2

● Ideal scaling is hard to determine (requires Hessian information)
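A small numerical check of the scaling effect, using steepest descent with an exact line search for a quadratic f = x^T Q x (the starting points are illustrative assumptions):

import numpy as np

def sd_quadratic(Q, x0, n_iter):
    # Steepest descent with exact line search for f(x) = x^T Q x
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = 2.0 * Q @ x                           # gradient
        if np.linalg.norm(g) < 1e-12:
            break
        alpha = (g @ g) / (2.0 * g @ Q @ g)       # exact line search step size
        x = x - alpha * g
    return x

Q_unscaled = np.diag([1.0, 16.0])    # f = x1^2 + 16 x2^2
Q_scaled   = np.diag([1.0, 1.0])     # f = y1^2 + y2^2 after y2 = 4 x2
print(sd_quadratic(Q_unscaled, [4.0, 1.0], n_iter=5))   # zig-zags, not yet at (0, 0)
print(sd_quadratic(Q_scaled,   [4.0, 4.0], n_iter=1))   # reaches (0, 0) in a single step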

Fletcher-Reeves
Conjugate Gradient method (CG)
● Based on building a set of N conjugate directions, combined with line searches

● Conjugate directions (with respect to a symmetric, positive definite matrix A):

  d_i^T A d_j = 0   for i ≠ j

  (Examples: orthogonal directions, eigenvectors)

● Conjugate gradient method:

– Matrix A is not needed; the set of directions d is constructed during the process

– Guaranteed convergence: minimizes quadratic problems in N line search steps
  (recall Powell’s Conjugate Directions: N cycles of N+1 line searches)

Conjugate Gradient algorithm


1. Start with arbitrary x1

2. Set the first search direction:  d1 = −∇f1

3. Line search to find the next point:  x_{i+1} = x_i + α_i d_i

4. Next search direction:  d_{i+1} = −∇f_{i+1} + ( ‖∇f_{i+1}‖^2 / ‖∇f_i‖^2 ) d_i

5. Repeat from step 3

6. Restart every (n+1) steps, using step 2

(proofs & underlying math omitted)
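A minimal Python sketch of steps 1–6 (Fletcher-Reeves update with periodic restart), using scipy's scalar minimizer for the line searches; the iteration limit and tolerance are assumed stopping criteria:

import numpy as np
from scipy.optimize import minimize_scalar

def conjugate_gradient(f, grad, x0, n_iter=100, tol=1e-8):
    x = np.asarray(x0, dtype=float)                          # 1. arbitrary starting point
    g = grad(x)
    d = -g                                                   # 2. first direction: -grad f
    n = x.size
    for i in range(1, n_iter + 1):
        if np.linalg.norm(g) < tol:
            break
        alpha = minimize_scalar(lambda a: f(x + a * d)).x    # 3. line search
        x = x + alpha * d
        g_new = grad(x)
        if i % (n + 1) == 0:
            d = -g_new                                       # 6. restart every (n+1) steps
        else:
            beta = (g_new @ g_new) / (g @ g)                 # 4. Fletcher-Reeves factor
            d = -g_new + beta * d
        g = g_new
    return x, f(x)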

CG properties
● Theoretically converges in N steps or less for quadratic functions

● No zig-zag as in steepest descent

● In practice:

– Non-quadratic functions

– Finite line search accuracy

– Round-off errors

→ slower convergence; more than N steps

● After N steps / bad convergence: restart procedure

  d_{N+1} = −∇f_{N+1}, etc.

First order Multivariate Unconstrained Optimization Algorithms - Summary
● First order methods (descent methods):

– Steepest descent method (with line search)

– Fletcher-Reeves Conjugate Gradient method

– Quasi-Newton methods (next lecture)


● Conclusions (for now):

– Scaling is important for Steepest Descent (zig-zag)

– For a quadratic problem, CG converges in N steps

● More in next lecture

Final remarks
● Papalambros: par. 7.1, 7.2

● Exercise 4.1 + 4.2: unconstrained optimization, steepest descent (4.3 later)

– Hand-in Exercise 3 & 4: Friday May 20, 11 pm, via BS (keep it brief, no extensive reports needed)

– Pairs: please both submit
