GENETIC ALGORITHMS
Spring 2024 CS550214 Big Data Analytics
Credits
1. B1: Machine Learning: An Algorithmic Perspective. 2nd Edition. Stephen Marsland. CRC Press, 2015.
2. B2: Principles of Soft Computing. 3rd Edition. S. N. Sivanandam, S. N. Deepa. Wiley, 2018.
3. https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm
Assignment
Read:
B1: Chapter 10 (up to Section 10.3)
Problems:
B1:
Evolution as a search problem
Competing animals and “Survival of the fittest”
“Fittest” animals
◼ Live longer
◼ Stronger
◼ More attractive
Hence, they get more mates and produce more (and healthier) offspring
Nature is biased towards “fitter” animals for reproduction
Basis for Genetic Algorithms
Parent chromosomes are copied randomly to the child
But copy errors can happen: Mutation
Genetic Algorithm (GA)
Modelling a problem as a GA
A method for representing solutions as chromosomes (or
string of characters)
A way to calculate the fitness of a solution
One generation:
A selection method to choose parents
A way to generate offspring by breeding the parents
A way to select the next generation
Select, Produce, Repeat!
An Evolutionary Learning Example
Knapsack Problem: Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.
[Figure: items of sizes A1–A6 and a pack of volume b]
Can we find k objects which will fit the pack volume b perfectly?
String Representation
L = #items
S = 1 1 0 0 0 1 0 1 0 1 0 0 0
S_i = 1 if the i-th item is included in the solution, 0 otherwise
Fitness Estimation
More valuable items => Better fitness
Fitness Value = Value of all items in a solution
Solution should be feasible
Fitness Value = 0 (if items don’t fit)
Exploitation vs. Exploration
Exploitation: use the best solution so far to explore further solutions
Exploration: use sub-optimal solutions to explore further solutions
Exploration: Give infeasible solutions a chance for next generation
Fitness Value of infeasible solutions = Total value of all items − 2 × value of extra items
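As an illustration, the penalized knapsack fitness can be sketched in Python. The item values, weights, capacity, and the rule for deciding which items count as "extra" (here: greedily dropped in index order until the pack fits) are all assumptions made for this example.

```python
# Hypothetical item data for illustration: parallel value/weight lists.
values  = [10, 5, 8, 3, 7, 4]
weights = [ 4, 2, 3, 1, 3, 2]
capacity = 8  # pack volume b

def fitness(s):
    """Penalized fitness: total value of selected items, minus twice the
    value of the 'extra' items when the solution does not fit."""
    total_value  = sum(v for v, bit in zip(values, s) if bit)
    total_weight = sum(w for w, bit in zip(weights, s) if bit)
    if total_weight <= capacity:
        return total_value
    # Infeasible: drop items (assumption: in index order) until it fits,
    # treating the dropped items' value as the "extra" value.
    extra = 0
    for i, bit in enumerate(s):
        if bit and total_weight > capacity:
            total_weight -= weights[i]
            extra += values[i]
    return total_value - 2 * extra

print(fitness([1, 1, 0, 0, 0, 0]))  # feasible (weight 6): value 15
print(fitness([1, 1, 1, 0, 1, 0]))  # infeasible (weight 12): 30 - 2*10 = 10
```

The penalty keeps infeasible strings in play for exploration while still ranking them below comparable feasible ones.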
Selecting Parents for the “Mating Pool”
Exploitation: select the “fittest” solutions
Exploration: allow some sub optimal solutions
1. Tournament Selection
2. Truncation Selection
3. Fitness Proportional Selection
Tournament Selection
Way of selecting one parent (individual) at a time
Choose k (the tournament size) individuals from the population at random
Choose the best individual from the tournament with probability p
Choose the second best individual with probability p × (1 − p)
Choose the third best individual with probability p × (1 − p)²
…and so on (till one parent is selected)
Deterministic version: p = 1
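A minimal sketch of tournament selection; the toy population (where each number is its own fitness) and the default k and p are illustrative.

```python
import random

def tournament_select(population, fitness, k=3, p=0.8):
    """Sample k individuals at random, then pick the i-th best of the
    tournament with probability p * (1 - p)**i."""
    contestants = sorted(random.sample(population, k),
                         key=fitness, reverse=True)
    for individual in contestants:
        if random.random() < p:
            return individual
    return contestants[-1]  # nobody chosen: fall back to the weakest

# Toy population where fitness is the value itself; with p = 1 and
# k = population size, the choice is deterministic: the fittest wins.
pop = list(range(10))
best = tournament_select(pop, fitness=lambda x: x, k=len(pop), p=1.0)
# best == 9
```

Smaller k and p below 1 trade exploitation for exploration: weaker strings keep a chance of becoming parents.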
Truncation Selection
MP = the best f fraction of strings in the population
Mating pool = 1/f copies of MP
So that mating pool size = initial population size
Randomly shuffle the pool and make pairs
Easy to implement but biased towards exploitation
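A sketch of truncation selection under the same toy setup as before (numbers are their own fitness; f = 0.5 is illustrative).

```python
import random

def truncation_select(population, fitness, f=0.5):
    """Keep the best f-fraction of the population (MP), then copy it
    1/f times so the mating pool matches the population size."""
    n = len(population)
    keep = max(1, int(f * n))
    mp = sorted(population, key=fitness, reverse=True)[:keep]
    pool = (mp * -(-n // keep))[:n]   # ceil(1/f) copies, trimmed to n
    random.shuffle(pool)              # shuffle before making pairs
    return pool

pop = list(range(10))
pool = truncation_select(pop, fitness=lambda x: x, f=0.5)
# pool holds only the top half {5..9}, each string appearing twice
```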
Fitness Proportional Selection
Select (with replacement) a string α probabilistically in proportion to its fitness
p_α = F^α / Σ_α′ F^α′
If F^α can be negative:
p_α = exp(s F^α) / Σ_α′ exp(s F^α′)
s: selection strength
Higher s gives higher (or less negative) F^α a higher probability
Roulette Wheel Selection
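Roulette-wheel selection is the standard way to implement fitness-proportional selection: each string gets a slice of a wheel proportional to its (here, exponentiated) fitness, and a random spin picks the parent. A minimal sketch; the toy population and selection strength are illustrative.

```python
import math
import random

def roulette_select(population, fitness, s=1.0):
    """Fitness-proportional selection via the softmax form
    p_a = exp(s * F_a) / sum_a' exp(s * F_a'),
    which also handles negative fitness values."""
    slices = [math.exp(s * fitness(a)) for a in population]
    r = random.uniform(0, sum(slices))   # spin the wheel
    cumulative = 0.0
    for individual, width in zip(population, slices):
        cumulative += width
        if r <= cumulative:
            return individual
    return population[-1]

random.seed(1)
pop = [0, 1, 2]
picks = [roulette_select(pop, fitness=lambda x: x, s=2.0)
         for _ in range(1000)]
# p_2 = e^4 / (e^0 + e^2 + e^4) ≈ 0.87, so string 2 dominates the picks
```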
Generating Offspring
Genetic Operators
CrossOver
Mutation
Crossover
Variants: Single Point, Multi Point, Random
Global exploration
Offspring are radically different from their parents
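Single-point and multi-point crossover on bit strings can be sketched as follows; the all-zeros / all-ones parents are just a way to make the swapped segments visible.

```python
import random

def single_point_crossover(p1, p2):
    """Cut both parents at one random point and swap the tails."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def multi_point_crossover(p1, p2, points=2):
    """Cut at several random points, alternating the source parent."""
    cuts = sorted(random.sample(range(1, len(p1)), points)) + [len(p1)]
    c1, c2, last, swap = [], [], 0, False
    for cut in cuts:
        a, b = (p2, p1) if swap else (p1, p2)
        c1 += a[last:cut]
        c2 += b[last:cut]
        last, swap = cut, not swap
    return c1, c2

a, b = [0] * 8, [1] * 8
c1, c2 = single_point_crossover(a, b)
# every position is 0 in exactly one child, so the 0-counts sum to 8
```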
Mutation
Flip a bit with a low probability p = 1/L
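A sketch of bit-flip mutation with the default rate p = 1/L, so on average one bit per string changes.

```python
import random

def mutate(s, p=None):
    """Flip each bit independently with probability p, default 1 / L."""
    if p is None:
        p = 1.0 / len(s)
    return [bit ^ 1 if random.random() < p else bit for bit in s]

random.seed(0)
child = mutate([0] * 10)   # on average, one bit flips per string
```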
Choosing Next Generation
Choosing only offspring can be risky
New generation can have lower fitness values
We can lose a really good sample
Elitism
Select a few fittest strings from the parents, say X
Replace strings from the offspring with X
◼ Replacement at random, or,
◼ Replace the least fit
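One way to implement the "replace the least fit" variant of elitism; the toy numbers double as their own fitness values.

```python
def elitist_next_generation(parents, offspring, fitness, n_elite=2):
    """Carry the n_elite fittest parents forward, replacing the
    least-fit offspring (the second replacement variant above)."""
    elite = sorted(parents, key=fitness, reverse=True)[:n_elite]
    kept = sorted(offspring, key=fitness, reverse=True)[:-n_elite]
    return elite + kept

parents = [9, 8, 1, 2]
offspring = [3, 4, 5, 6]
next_gen = elitist_next_generation(parents, offspring, lambda x: x)
# next_gen == [9, 8, 6, 5]: elites 9, 8 replace least-fit offspring 3, 4
```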
Tournament
The two fittest parents and their offspring compete
Tournament winners proceed to the next round
Elitism and Tournament can lead to premature convergence
Both promote fitter members
After some time, the same set of fittest members keeps getting promoted
Exploration is downplayed
Solution:
Niching (or “using island populations”)
Fitness sharing
Niching
Separate populations into subpopulations
Each subpopulation converges independently to different local minima
A few members of one sub-population are randomly injected as
“immigrants” to another.
Fitness Sharing
Shared fitness: F^α ← F^α / (#times α appears in the population)
Biased in favour of uncommon strings
But we can lose very good common strings
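A sketch of fitness sharing; the toy population of bit strings and the "count the 1s" fitness are illustrative.

```python
from collections import Counter

def shared_fitness(population, fitness):
    """Divide each string's raw fitness by its number of copies in the
    population, biasing selection toward uncommon strings."""
    counts = Counter(population)
    return {a: fitness(a) / counts[a] for a in population}

pop = ["1010", "1010", "1010", "0111"]
f = shared_fitness(pop, fitness=lambda s: s.count("1"))
# "1010": raw fitness 2 shared by 3 copies -> 2/3; "0111": 3/1 -> 3.0
```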
Modelling a problem as a GA
A method for representing solutions as chromosomes (or string of
characters)
A way to calculate the fitness of a solution
Exploration vs. exploitation
One generation:
A selection method to choose parents
Tournament, Truncation, Fitness Proportional selection
A way to generate offspring by breeding the parents
Crossover and mutation
A way to select the next generation
Elitism, Tournament, Niching, Fitness sharing
Select, Produce, Repeat! => Until the stopping criterion is met
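Putting the recipe together for the knapsack example, a minimal end-to-end loop might look as follows. The instance data, population size, generation count, and the simple design choices (infeasible fitness = 0, deterministic 3-way tournament, one elite kept per generation) are all assumptions for illustration.

```python
import random

# Illustrative knapsack instance: value/weight per item, pack volume b.
values, weights, capacity = [10, 5, 8, 3, 7, 4], [4, 2, 3, 1, 3, 2], 8
L = len(values)

def fitness(s):
    v = sum(v for v, bit in zip(values, s) if bit)
    w = sum(w for w, bit in zip(weights, s) if bit)
    return v if w <= capacity else 0      # simplest variant: infeasible -> 0

def select(pop):                          # deterministic tournament, k = 3
    return max(random.sample(pop, 3), key=fitness)

def crossover(p1, p2):                    # single point
    cut = random.randint(1, L - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(s):                            # flip each bit w.p. 1/L
    return [b ^ 1 if random.random() < 1 / L else b for b in s]

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(20)]
for generation in range(50):              # Select, Produce, Repeat!
    children = [max(pop, key=fitness)]    # elitism: keep the best string
    while len(children) < len(pop):
        a, b = crossover(select(pop), select(pop))
        children += [mutate(a), mutate(b)]
    pop = children[:len(pop)]

best = max(pop, key=fitness)
print(best, fitness(best))
```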
Four Color Theorem: Map Colouring Problem
No more than four colors are required to color the regions of the map so
that no two adjacent regions have the same color.
We will formulate the Three-Color Problem using a GA
Image Credit: https://brilliant.org/wiki/four-color-theorem/
Encoding Solutions
Three colours: {black (𝑏), dark (𝑑), light(𝑙)}
We assign a fixed order to the regions
For a six-region map, a solution looks like
𝛼 = {𝑏𝑑𝑏𝑙𝑏𝑏}
Fitness Function
A negative point for every pair of adjacent regions having the same color
Fitness value can be negative, so use
p_α = exp(s F^α) / Σ_α′ exp(s F^α′)
Or,
F^α = #Total boundaries − #Adjacent region pairs having the same color
No negative fitness values
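A sketch of the non-negative fitness variant for a six-region map; the adjacency (boundary) list below is an assumed example, not the map from the figure.

```python
# Assumed boundary pairs for a hypothetical six-region map,
# with regions numbered 0..5 in the fixed order.
boundaries = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]

def fitness(solution):
    """#Total boundaries minus #boundaries whose regions share a colour."""
    conflicts = sum(1 for i, j in boundaries if solution[i] == solution[j])
    return len(boundaries) - conflicts

print(fitness("bdbldb"))   # regions 0 and 2 clash -> 7 - 1 = 6
print(fitness("bbbbbb"))   # every boundary clashes -> 0
```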
Genetic Operators
Mutation
Crossover
Limitations of GA
Can get stuck in a local minimum for a very long time
Like a black box:
We don’t know what the error landscape looks like or how the algorithm is working
No guarantee to converge
Training Neural Networks with GA
Encode weights as strings
Fitness function: sum-of-square errors
Reasonably good results.
Problems:
Local error information at an output node is lost: the entire error is clubbed into one number
Not using gradient information
Types of Encoding in GA
Binary Encoding: a string of 0s and 1s
E.g.: Knapsack problem
Permutation Encoding: string of numbers representing a sequence
E.g.: Sorting a sequence of numbers
4 2 5 9 0 7 8 1 3 6
How to do crossover?
4 2 5 9 0 7 8 1 3 6
+
8 1 9 4 2 5 6 3 0 7
=
4 2 5 9 0 8 1 6 3 7
How to mutate?
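The crossover shown above is an order-style crossover: keep a prefix of one parent, then append the remaining genes in the order they occur in the other parent, so the child is still a valid permutation. Mutation for permutations is typically a swap of two positions. A sketch:

```python
import random

def order_crossover(p1, p2, cut=None):
    """Order crossover: copy a prefix from parent 1, then append
    parent 2's remaining genes in the order they appear there."""
    if cut is None:
        cut = random.randint(1, len(p1) - 1)
    head = p1[:cut]
    tail = [g for g in p2 if g not in head]
    return head + tail

def swap_mutation(s):
    """Mutate a permutation by swapping two random positions."""
    s = list(s)
    i, j = random.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

p1 = [4, 2, 5, 9, 0, 7, 8, 1, 3, 6]
p2 = [8, 1, 9, 4, 2, 5, 6, 3, 0, 7]
child = order_crossover(p1, p2, cut=5)
# child == [4, 2, 5, 9, 0, 8, 1, 6, 3, 7] -- the example above
```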
Value Encoding: a string of real numbers
E.g.: Training weights in neural networks
Tree Encoding: Each chromosome is a tree
E.g.: Given input and output values, find a function which gives the best (closest to the wanted) output for all inputs.
(+ x (/ 5 y))