Introduction to
Genetic Algorithms
Max Peysakhov
About Genetic Algorithms
In the 1950s and 1960s computer scientists started to study evolutionary optimization systems. Genetic Algorithms were one of them.
Genetic Algorithms (GA) operate on a set (population) of candidate solutions (individuals, also called chromosomes).
Chromosomes are strings or arrays of genes (a gene is the smallest building block of the solution).
Each iteration of the search is called a generation.
The idea was to evolve a solution of high quality from the
population of candidate solutions.
Architecture of the Genetic Algorithm
Diagram: the GA loop. Initialization produces the initial population from the problem data; chromosomes are evaluated to give an evaluated population; chromosomes are selected from it; the selected chromosomes undergo crossover and then mutation to produce the next generation's chromosomes, which loop back to evaluation. The best solution found so far is the output.
Illustration of the Genetic Algorithm
Figure: the evaluation function defines a fitness landscape. An initial population of individuals scattered over the landscape evolves into a final population clustered near the peaks, from which the best individual solution is taken.
Representation and Initialization
You begin with a population of random bit strings.
Each bit string encodes some problem configuration.
Example: you can encode a SAT problem by assigning each Boolean variable a position in the bit string:
(a0 a1 a2 ) (a0 a2 a3 ) (a0 a2 a3 ) (a2 a3 a4 ) (a0 a2 a4 )
1 0 1 1 0
0 1 1 0 0
1 1 1 1 0
0 0 1 0 1
1 0 1 0 1
N randomly generated individuals (the initial population).
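A minimal Python sketch of this initialization step (function and parameter names are ours):

```python
import random

def random_population(n_individuals, n_genes, seed=None):
    """Generate the initial population: n_individuals random bit strings,
    each n_genes long (one gene per Boolean variable of the SAT instance)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_genes)]
            for _ in range(n_individuals)]

population = random_population(5, 5, seed=42)
```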
Some Terminology
Chromosome / Individual / Genotype: a population member; the configuration of the bits in the string.
Genes: the individual positions in the string.
Phenotype: what the genotype represents, e.g. the string 1 0 1 0 1 stands for a0 = T, a1 = F, a2 = T, a3 = F, a4 = T.
Population: the set of chromosomes, e.g.
1 0 1 0 1
1 0 1 1 0
0 1 1 0 0
1 1 1 1 0
0 0 1 0 1
1 0 1 0 1
Fitness landscape: the graph of the fitness function.
Fitness Function
In nature: THE FITTEST SURVIVE.
In a GA we have to decide how to measure fitness.
The fitness function should guide the algorithm to the optimal solution.
Example: for the SAT problem we can use the number of satisfied clauses as the fitness function.
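A sketch of such a fitness function in Python. The clause encoding, where each clause is a list of (variable index, negated) pairs, is our choice for the illustration, not from the slides:

```python
def sat_fitness(chromosome, clauses):
    """Count satisfied clauses. Each clause is a list of
    (variable_index, negated) pairs; a literal is true when the bit at
    variable_index, read as a Boolean, differs from the negated flag."""
    count = 0
    for clause in clauses:
        if any((chromosome[i] == 1) != negated for i, negated in clause):
            count += 1  # at least one literal is true: clause satisfied
    return count
```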
Selection: Fitness Proportionate
Once we have evaluated the individuals, how do we choose the ones that survive?
Each individual gets a chunk of a roulette wheel proportional to its fitness relative to the fitness of the others.
Spin the roulette wheel N times.
Figure: a roulette wheel for Fitness Proportional Selection, with one slice per chromosome (Chr 1 through Chr 10) sized by fitness.
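A roulette-wheel selection sketch in Python; the linear scan over cumulative fitness is one common way to implement the spin (names are ours):

```python
import random

def fitness_proportionate_select(population, fitnesses, n, rng=None):
    """Spin the roulette wheel n times; each individual owns a slice of
    the wheel proportional to its share of the total fitness."""
    rng = rng or random.Random()
    total = sum(fitnesses)
    selected = []
    for _ in range(n):
        spin = rng.uniform(0, total)   # where the ball lands
        cumulative = 0.0
        for individual, f in zip(population, fitnesses):
            cumulative += f
            if spin <= cumulative:     # ball fell in this slice
                selected.append(individual)
                break
    return selected
```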
Selection: Universal Stochastic Sampling
When using Fitness Proportional Selection it is possible (although unlikely) that all selected individuals will have the lowest fitness.
To fix that problem, USS spins an N-pointer roulette wheel only once.
Figure: a Universal Stochastic Sampling wheel over chromosomes Chr 1 through Chr 10, with N equally spaced pointers.
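One way to implement the single N-pointer spin (a sketch; names are ours):

```python
import random

def universal_stochastic_sampling(population, fitnesses, n, rng=None):
    """One spin of a wheel with n equally spaced pointers. Each
    individual is then picked a number of times between the floor and
    the ceiling of its expected count."""
    rng = rng or random.Random()
    total = sum(fitnesses)
    step = total / n
    start = rng.uniform(0, step)                  # the single random spin
    pointers = [start + i * step for i in range(n)]
    selected = []
    cumulative = 0.0
    idx = 0
    for p in pointers:                            # walk the wheel once
        while cumulative + fitnesses[idx] < p:
            cumulative += fitnesses[idx]
            idx += 1
        selected.append(population[idx])
    return selected
```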
Selection: Sigma Scaling
Some individuals may be much more fit than others. Their offspring will dominate the population and can cause convergence on a local maximum.
Sigma Scaling is designed to keep the selection pressure relatively constant over the run of the GA.
Exp.Val(i) = iif(StdDev = 0, 1, 1 + (f(i) − Mean(f)) / (2 · StdDev))
Figure: expected offspring counts under Sigma Scaling Selection for chromosomes Chr 1 through Chr 10.
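The sigma-scaling formula above, as a Python sketch. The 0.1 floor for negative expected values is our addition so very weak individuals keep a small chance of being selected:

```python
import statistics

def sigma_scaled_expectation(fitnesses):
    """Expected offspring counts under sigma scaling:
    ExpVal(i) = 1 + (f(i) - mean) / (2 * stddev), or 1 for everyone
    when stddev = 0. Values below 0.1 are clipped to 0.1 (our choice)."""
    mean = statistics.mean(fitnesses)
    stddev = statistics.pstdev(fitnesses)   # population standard deviation
    if stddev == 0:
        return [1.0] * len(fitnesses)
    return [max(0.1, 1 + (f - mean) / (2 * stddev)) for f in fitnesses]
```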
Selection: Rank proportional
Same purpose as Sigma Scaling.
All individuals in the population are ranked in order of their fitness.
By adjusting the constant Max, the desired constant selection pressure can be achieved:
Min + Max = 2 and 1 < Max < 2
Exp.Val(i) = Min + (Max − Min) · ((Rank(i) − 1) / (N − 1))
Figure: expected offspring counts under Rank Proportional Selection for chromosomes Chr 1 through Chr 10.
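A sketch of the ranking formula; because Min = 2 − Max, the expected values sum to N, so Max alone sets the selection pressure:

```python
def rank_expectation(fitnesses, max_ev=1.5):
    """Expected offspring counts under linear ranking: the worst
    individual has rank 1, the best has rank N, and min_ev = 2 - max_ev.
    max_ev (the slide's Max) controls the selection pressure."""
    n = len(fitnesses)
    min_ev = 2 - max_ev
    order = sorted(range(n), key=lambda i: fitnesses[i])  # ascending fitness
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r                                       # 1-based rank
    return [min_ev + (max_ev - min_ev) * (rank[i] - 1) / (n - 1)
            for i in range(n)]
```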
Crossover: Single Point Crossover
Two parents are selected at random.
A single crossover point is selected at random.
The chromosomes are cut at the crossover point.
The tail of each chromosome is spliced onto the head of the other.
Parent 1: 1 1 1 | 1 0      Offspring 1: 1 1 1 0 1
Parent 2: 0 0 1 | 0 1      Offspring 2: 0 0 1 1 0
(| marks the crossover point)
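A sketch of single-point crossover (names are ours):

```python
import random

def single_point_crossover(parent1, parent2, rng=None):
    """Cut both parents at one random interior point and splice the
    tail of each onto the head of the other."""
    rng = rng or random.Random()
    point = rng.randint(1, len(parent1) - 1)   # cut strictly inside the string
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2
```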
Crossover: K-Point Crossover
Two parents are selected at random.
k crossover points are selected at random.
The chromosomes are cut at the crossover points.
Alternating sections are swapped.
Parent 1: 1 1 1 1 0      Offspring 1: 1 0 1 1 1
Parent 2: 0 0 1 0 1      Offspring 2: 0 1 1 0 0
(genes between the crossover points are swapped)
Crossover: Uniform Crossover
Two parents are selected at random.
Iterate through the chromosomes.
With probability pc, swap the individual genes at each position.
Parent 1: 1 1 1 1 0      Offspring 1: 0 1 1 1 1
Parent 2: 0 0 1 0 1      Offspring 2: 1 0 1 0 0
(here the first and last genes were swapped)
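Uniform crossover as a sketch, with the per-gene swap probability as a parameter:

```python
import random

def uniform_crossover(parent1, parent2, p_swap=0.5, rng=None):
    """Walk down the two chromosomes; at each position, swap the pair
    of genes with probability p_swap (0.5 is a common choice)."""
    rng = rng or random.Random()
    child1, child2 = list(parent1), list(parent2)
    for i in range(len(child1)):
        if rng.random() < p_swap:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2
```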
Mutation
Iterate through the chromosomes.
With probability pm, alter individual genes.
Before: 1 1 1 1 0
After:  1 1 0 1 0
(the third gene was mutated)
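Bit-flip mutation over a chromosome, as a sketch:

```python
import random

def mutate(chromosome, p_mutation=0.01, rng=None):
    """Flip each bit independently with probability p_mutation."""
    rng = rng or random.Random()
    return [1 - g if rng.random() < p_mutation else g for g in chromosome]
```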
Elitism
A trick that works:
Select a small number of the most fit individuals.
Copy them to the new generation unchanged.
This preserves valuable members of the population.
Current Generation              Next Generation
Elite:  1 0 1 1 0               Elite (copied unchanged): 1 0 1 1 0
        0 1 1 0 0                                         0 1 1 0 0
Rest of the population:         Rest of the population (via selection,
        1 1 1 1 0               crossover, and mutation):
        1 0 1 1 0                                         1 0 1 0 1
        0 0 1 0 1                                         1 0 1 1 1
        1 0 1 0 1
When to Stop?
When you have found an optimal solution.
When you have completed a predefined number of generations.
When the time limit expires.
When the population converges to a single individual.
When the fitness function converges to a single value.
After some number of generations with no improvement.
Any other ideas?
Usually a combination of 2 or 3 stopping criteria is used.
Simple GA
1. Let P = a random population of N individuals.
2. While the stopping criterion is false, do:
3.   Let fi = Fitness(Pi) for each Pi ∈ P
4.   Let P′ = Select_New_Population(P, f)
5.   Let P′ = Crossover(P′)
6.   Let P′ = Mutate(P′)
7.   Let P = P′
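The seven steps can be sketched as a runnable Python program. The operator choices (roulette selection, single-point crossover, bit-flip mutation) follow the earlier slides; the default rates and the fixed-generation stopping criterion are our assumptions:

```python
import random

def simple_ga(fitness, n_genes, n_individuals=20, n_generations=50,
              p_crossover=0.75, p_mutation=0.02, seed=None):
    """A bare-bones GA over bit strings, following the pseudocode."""
    rng = random.Random(seed)
    # 1. Random initial population of N individuals
    pop = [[rng.randint(0, 1) for _ in range(n_genes)]
           for _ in range(n_individuals)]
    best = max(pop, key=fitness)
    for _ in range(n_generations):                 # 2. stopping criterion
        fits = [fitness(ind) for ind in pop]       # 3. evaluate
        if sum(fits) > 0:                          # 4. select (roulette wheel)
            pop = rng.choices(pop, weights=fits, k=n_individuals)
        else:
            pop = [rng.choice(pop) for _ in range(n_individuals)]
        next_pop = []                              # 5. crossover adjacent pairs
        for i in range(0, n_individuals - 1, 2):
            a, b = pop[i], pop[i + 1]
            if rng.random() < p_crossover:
                cut = rng.randint(1, n_genes - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            next_pop += [a, b]
        if n_individuals % 2:
            next_pop.append(pop[-1])
        pop = [[1 - g if rng.random() < p_mutation else g   # 6. mutate
                for g in ind] for ind in next_pop]
        best = max(pop + [best], key=fitness)      # track best found so far
    return best
```

For example, with the fitness function `sum` (count of ones), the returned individual quickly approaches the all-ones string.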
Back to Our Example
Let's simulate one iteration of a GA with population size 4:
(a0 a1 a2 ) (a0 a2 a3 ) (a0 a2 a3 ) (a2 a3 a4 ) (a0 a2 a4 )
Parameter Tuning
We have defined a large number of control parameters:
Crossover rate
Mutation rate
Stopping criteria
Population size
Any others?
How do we set these? Any ideas?
This is a wide-open research area.
There is no general set of rules or guidelines.
The most often used method is trial-and-error.
Parameter Tuning (2)
Meta-level optimization:
Employ a second GA to optimize the parameters of the first one.
Fitness evaluation is expensive.
It is difficult to identify the fitness.
Adapting control parameters:
Adapt control parameters over time.
Use problem-related feedback to adjust parameters.
Messy Genetic Algorithms
The length of the chromosome may change during execution.
This is achieved by using messy crossover operators:
Cut: cuts a chromosome at an arbitrary point.
Splice: joins two chromosome parts together.
Used when the length of the solution is unknown.
Parent 1: 1 1 1 1 | 0      Offspring 1: 1 1 1 1 0 1
Parent 2: 0 0 1 | 0 1      Offspring 2: 0 0 1 0
(| marks the cut points; heads and tails are spliced crosswise)
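Cut and splice can be sketched as follows; because each parent gets its own cut point, the offspring lengths differ from the parents':

```python
import random

def cut_and_splice(parent1, parent2, rng=None):
    """Messy crossover: cut each parent at its own random point, then
    splice the head of each with the tail of the other."""
    rng = rng or random.Random()
    cut1 = rng.randint(1, len(parent1) - 1)
    cut2 = rng.randint(1, len(parent2) - 1)
    child1 = parent1[:cut1] + parent2[cut2:]
    child2 = parent2[:cut2] + parent1[cut1:]
    return child1, child2
```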
Genetic Programming (GP)
Formulated by Koza in 1990 as a means of automatic programming.
Similar to GA, but:
Chromosomes are encoded as parse trees, typically representing a computer program in Lisp.
Example: (if (> x y) x y) is the parse tree with IF at the root, the test (> x y) as its first child, and x and y as the remaining branches.
GP: Initialization
Define your function set C, e.g. {+, -, *, /, ABS, if, …}
Define your terminal set T, e.g. {0, 1, 2, …, x, y, z, …, Pi, e, …}
For each population member:
1. Start with an empty parse tree P.
2. Add Xi to P such that Xi ∈ C or Xi ∈ T.
3. Repeat the procedure recursively for all children of Xi.
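A sketch of growing one random parse tree as nested Python lists in prefix form; the particular function and terminal sets, the depth limit, and the 0.3 early-stop probability are toy choices of ours:

```python
import random

# Toy function set (name -> arity) and terminal set; our illustration,
# not the slide's exact sets.
FUNCTIONS = {'+': 2, '-': 2, '*': 2}
TERMINALS = ['x', 'y', 0, 1, 2]

def random_tree(max_depth, rng):
    """Grow one random parse tree: either stop with a terminal, or pick
    a function and recurse for each of its children. At max_depth = 0
    a terminal is forced."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    fn = rng.choice(sorted(FUNCTIONS))
    return [fn] + [random_tree(max_depth - 1, rng)
                   for _ in range(FUNCTIONS[fn])]

def well_formed(tree):
    """Check that every function node has exactly its arity of children."""
    if isinstance(tree, list):
        return (tree[0] in FUNCTIONS
                and len(tree) == FUNCTIONS[tree[0]] + 1
                and all(well_formed(c) for c in tree[1:]))
    return tree in TERMINALS
```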
GP: Crossover
Two parents are selected at random.
A random node is independently picked in each parent.
The sub-trees rooted at these nodes are swapped.
Before: (if (> x y) x y) and (* (+ x y) y)
After: (if (> x y) x (+ x y)) and (* y y)
(the crossover points were the final y of the first parent and the sub-tree (+ x y) of the second)
GP: Mutation
One chromosome is selected at random.
A random node is picked.
The sub-tree rooted at that node is removed.
A random sub-tree is grown out of that node.
Before: (if (> x y) x y)
After: (if (> x y) (- x y) y)
(the mutated gene: the second argument x was replaced by the random sub-tree (- x y))
Anytime Algorithms
Algorithms capable of trading off execution time for solution quality are called anytime algorithms.
GAs are naturally anytime algorithms.
Figure: solution quality versus time. A regular algorithm yields nothing until it completes; an anytime algorithm's solution quality grows steadily, so it can be interrupted at any point and still return its best solution so far.
Knapsack Problem
Definition: given items of different values and weights, find the most valuable set of items that fits in a knapsack with a fixed weight limit.
A classic NP-hard combinatorial optimization problem with a wide range of applications.
Conventional search techniques may take too long to produce a result of any value.
Let's set up a GA to solve this problem.
Knapsack Problem: Choosing an Algorithm.
Choices:
Genetic Algorithm
Messy Genetic Algorithm
Genetic Programming
Knapsack Problem: Choosing a Representation.
Answer: Messy Genetic Algorithms, because the solution length is unknown up front.
Given a set (array) of items S0 to SN, find a suitable representation of the chromosome:
Item: S0 S1 S2 S3 S4 S5 S6 S7 S8 SN
P(s): 64 59 35 27 31 34 43 63 62 15
W(s): 13 75 32 53 83  5  7 37 41 48
Knapsack Problem: Choosing Crossover Operators.
Answer: Genotype: 0 1 5 6 2 8
Phenotype: items S0, S1, S5, S6, S2, S8
with total weight 326
and total value 142.
Choose a crossover operator:
Single point
K-point (what's the k?)
Uniform
Knapsack Problem: Choosing Mutation Operators.
Answer: Got you! We are using a messy GA, so we have to use the messy crossover (cut and splice) operators.
Now create a mutation operator.
Knapsack Problem: Choosing Mutation Operators.
Answer: that's easy. Simply replace the mutated gene with a random number from 0 to N.
Let's see how it works:
Parent 1: 0 1 5 6 | 2 8      Offspring 1: 5 1 5 6 4 7 10
Parent 2: 5 3 2 | 4 7 10     Offspring 2: 5 3 2 2 8
(| marks the cut points; mutation then turned the leading 0 of offspring 1 into a 5)
Oops! Both offspring now contain duplicate genes. Can we fix it?
Knapsack Problem: Choosing Evaluation Function.
Answer: Sure. Delete all duplicate entries:
5 1 5 6 4 7 10 → 5 1 6 4 7 10
5 3 2 2 8 → 5 3 2 8
Now let's think about the evaluation function.
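The duplicate-removal repair can be sketched as:

```python
def remove_duplicates(genotype):
    """Repair a messy-GA chromosome by dropping repeated item indices,
    keeping the first occurrence of each."""
    seen = set()
    repaired = []
    for gene in genotype:
        if gene not in seen:
            seen.add(gene)
            repaired.append(gene)
    return repaired
```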
Knapsack Problem: Choosing Evaluation Function. (2)
Answer: A solution is worth the sum of its items' prices if it is within the weight limit; otherwise it is worth 0:
Fitness = iif(Sum(W(Si)) > Wkn, 0, 1) · Sum(P(Si))
That means that all solutions with excess weight are treated by the GA in the same way.
Shouldn't we assign more utility to the lighter solutions than to the heavier ones?
Can we improve the evaluation function?
Knapsack Problem: Choosing Selection Strategy
Answer: A solution is worth the sum of its items' prices if it is within the weight limit; otherwise its fitness is inversely proportional to its weight:
Fitness = iif(Sum(W(Si)) > Wkn, 1 / Sum(W(Si)), Sum(P(Si)))
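This evaluation function as a sketch, using the price/weight table from the representation slide (Wkn is the knapsack weight limit; the limit value in the example is our placeholder):

```python
def knapsack_fitness(genotype, prices, weights, limit):
    """Fitness per the slide: total price when the items fit within the
    weight limit; otherwise the inverse of the total weight, so lighter
    infeasible solutions score slightly better than heavier ones."""
    total_weight = sum(weights[i] for i in genotype)
    if total_weight > limit:
        return 1.0 / total_weight
    return float(sum(prices[i] for i in genotype))
```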
Choose a selection strategy:
Fitness proportional
USS
Sigma Scaling
Rank
Should we use elitism?
Knapsack Problem: Parameter Tuning
Answer: How do I know?
Elitism usually improves performance.
Although Fitness Proportionate selection is often the worst, only experiments will show the best selection strategy.
Parameters like the crossover and mutation rates, population size, and others have to be tuned after the problem is implemented in code.
I would start with:
Crossover rate: 0.75
Mutation rate: 0.01
Stopping criteria: 1000 generations
Population size: 100
GA Issues
Choice of representation is critical.
Choice of genetic operators is critical.
Design of the fitness function is critical.
Parameter tuning is critical.
There is no template for doing it correctly on the first attempt.
GA Discussions
Often a second-best way to solve a problem.
Easy to implement.
GA is an uninformed search.
It can be improved by using heuristic operators.
Evolving Virtual Creatures by Karl Sims
Creating virtual creatures that move in simulated
three-dimensional physical worlds.
The morphologies of creatures and the neural
systems for controlling their muscles are both
generated using genetic algorithms.
Different fitness evaluation functions are used to direct simulated evolution toward specific behaviors such as swimming, walking, jumping, and following.
More info at: http://www.genarts.com/karl/
Evolving Virtual Swimmers by Karl Sims
Simple paddling and tail-wagging creatures.
Some used symmetrical flippers to propel themselves.
Water-snake creatures evolved that wind through the water.
(Images: creatures evolved for swimming.)
Evolving Virtual Walkers and Jumpers by Karl Sims
Simple creatures that could shuffle or hobble along.
Some simply wag an appendage in the air to rock back and forth.
More complex creatures push or pull themselves along.
A number of simple jumping creatures did emerge.
(Images: creatures evolved for walking; creatures evolved for jumping.)
Evolving Virtual Creatures: The Movie, by Karl Sims
The Golem Project
Automatic Design and Manufacture of Robotic Lifeforms
Hod Lipson and Jordan Pollack
Golem project (Genetically Organized Lifelike Electro Mechanics)
Simple electro-mechanical systems evolved to yield physical locomoting machines.
This was the first time robots were robotically designed and robotically fabricated.
Golem Project Evolved Creatures
Hod Lipson and Jordan Pollack
(Images: the Arrow, Ratchet, Tetra, and Snake creatures.)
Golem Project Arrow Creature
Hod Lipson and Jordan Pollack
(Images: the real and the simulated Arrow creature.)
Golem Project Tetra Creature
Hod Lipson and Jordan Pollack
(Images: the real and the simulated Tetra creature.)
Project Objectives
Apply GA to Lego assembly generation.
Represent Lego assemblies precisely and
unambiguously.
Encode assemblies as chromosomes.
Adapt genetic operators.
Perform Genetic optimization on Lego structures.
System Overview
Our system was extended from sGA, originally created by Prof. Stephen Hartley in 1996, and is written in the Java programming language.
The Java3D package and VRML97 were used to create a visualizer to monitor Lego structures as they evolve.
The system supports:
One-point crossover
Proportional, rank, universal stochastic sampling, sigma scaling, and Boltzmann selection techniques
Elitism
It also allows input of the mutation and crossover rates and the population size.
Examples: 10 by 10 by 10 Light Structure
In this case the goal was to evolve a structure with a size of 10 Lego units in each of the x, y, and z dimensions, with minimal weight.
The left structure was created at generation 895 and measures 10 by 10 by 6.8.
The right structure was created at generation 3367 and measures 10 by 10 by 10.
Both structures are among the lightest possible structures satisfying these parameters that can be created from the given set of elements.
Examples: Pillar-Like Dense Structure
In this case the goal was to evolve a dense pillar-like structure with a 2 by 4 base and a height of 20, 40, or 60.
The structures shown were evolved in 5000 generations on average.
All of them exactly match the desired size and are among the densest possible structures.
(Image: heights 20, 40, and 60, from right to left.)
Examples: Staircase Structure
In this case the goal was to create staircases of 3, 4, 5, 6, 7, and 10 steps.
The set of available building blocks was reduced to 2 by 1 plates only.
The 7-step staircase was evolved in 51573 generations.
The 10-step staircase was evolved in 568 generations.
All of them exactly match the desired size.
(Images: staircases of 7 steps (top) and 10 steps (bottom).)