
Genetic Algorithms for Researchers

The document describes how genetic algorithms can be used for variable selection. It provides details on how genetic algorithms work, including initialization of a population, evaluation and selection, crossover and mutation operations, and evolution over multiple generations. An example is given for using a genetic algorithm to select an optimal protein signature from mass spectrometry data by encoding possible signatures in binary strings and evaluating their fitness for classification.

Uploaded by

Vaishali Jain
Copyright
© Attribution Non-Commercial (BY-NC)

Genetic Algorithm for Variable Selection

Jennifer Pittman ISDS Duke University

Genetic Algorithms Step by Step



Example: Protein Signature Selection in Mass Spectrometry

[Figure: mass spectrum, relative intensity vs. molecular weight. Source: http://www.uni-mainz.de/~frosc/fbg_po3.html]

Genetic Algorithm (Holland)

* heuristic method based on "survival of the fittest"

* useful when search space very large or too complex for analytic treatment

* in each iteration (generation) possible solutions or individuals represented as strings of numbers

[Figure: individuals shown as m/z values (e.g. 3021 3058 3240) alongside their binary strings]

© http://www.spectroscopynow.com

* all individuals in population evaluated by fitness function

* individuals allowed to reproduce (selection), crossover, mutate
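
The loop described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function name `run_ga`, its parameters, the list-of-bits representation, and the default probabilities are all assumptions chosen for the sketch. Selection here is fitness-proportional and fitness values are assumed to be non-negative.

```python
import random

def run_ga(fitness, n_bits, pop_size, generations, pc=0.6, pm=None, seed=0):
    """Hypothetical minimal GA over bit lists; returns the best individual found."""
    rng = random.Random(seed)
    pm = 1.0 / n_bits if pm is None else pm  # assumed default: 1 / number of bits
    # Initialization: random bit strings
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Evaluation and fitness-proportional selection
        scores = [fitness(ind) for ind in pop]
        weights = [s + 1e-9 for s in scores]  # epsilon keeps the total positive
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # Single-point crossover on consecutive pairs, with probability pc
        nxt = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < pc:
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            nxt += [a, b]
        if len(nxt) < pop_size:  # odd population size: carry one parent over
            nxt.append(parents[-1][:])
        # Mutation: flip each bit with probability pm
        for ind in nxt:
            for i in range(n_bits):
                if rng.random() < pm:
                    ind[i] ^= 1
        nxt[0] = best[:]  # elitist step: the best individual so far survives
        pop = nxt
        best = max(pop, key=fitness)
    return best

# Toy usage: maximize the number of 1 bits in a 24-bit string
result = run_ga(sum, n_bits=24, pop_size=12, generations=30)
```

Because of the elitist step, the best fitness in this sketch never decreases from one generation to the next.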

Flowchart of GA

[Figure: flowchart of a genetic algorithm. Source: http://ib-poland.virtualave.net/ee/genetic1/3geneticalgorithms.htm]

(a simplified example)

Initialization

* proteins corresponding to 256 mass spectrometry values from 3000-3255 m/z

* assume optimal signature contains 3 peptides represented by their m/z values in binary encoding

* population size ~ M = L/2, where L is signature length

Initial Population: M = 12, L = 24

[Figure: initial population of 12 binary strings, each 24 bits long]
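
One way to picture the encoding is sketched below. The slides say only that m/z values are binary encoded; storing each value as its 8-bit offset from 3000 is an assumption made here (it fits 256 values and gives 3 x 8 = 24 bits). The names `encode`, `decode`, and `random_population` are hypothetical.

```python
import random

MZ_MIN = 3000            # lowest m/z value in the example
BITS_PER_PEPTIDE = 8     # 2**8 = 256 distinct m/z values (3000..3255)
N_PEPTIDES = 3
L = BITS_PER_PEPTIDE * N_PEPTIDES   # signature length: 24 bits
M = L // 2                          # population size: 12

def encode(mz_values):
    """Phenotype -> genotype: each m/z value as an 8-bit offset from MZ_MIN (assumed scheme)."""
    return "".join(format(mz - MZ_MIN, "08b") for mz in mz_values)

def decode(bits):
    """Genotype -> phenotype: split into 8-bit chunks and add MZ_MIN back."""
    step = BITS_PER_PEPTIDE
    return [MZ_MIN + int(bits[i:i + step], 2) for i in range(0, len(bits), step)]

def random_population(rng):
    """M random bit strings of length L."""
    return ["".join(rng.choice("01") for _ in range(L)) for _ in range(M)]

signature = encode([3021, 3058, 3240])
population = random_population(random.Random(0))
```

Under this scheme every 24-bit string decodes to a valid signature, so crossover and mutation can never produce an out-of-range m/z value.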

Searching

* search space defined by all possible encodings of solutions

* selection, crossover, and mutation perform "pseudo-random" walk through search space

* operations are non-deterministic yet directed

[Figure: Phenotype Distribution. Source: http://www.ifs.tuwien.ac.at/~aschatt/info/ga/genetic.html]

Evaluation and Selection

* evaluate fitness of each solution in current population (e.g., ability to classify/discriminate) [involves genotype-phenotype decoding]

* selection of individuals for survival based on probabilistic function of fitness

* on average mean fitness of individuals increases

* may include elitist step to ensure survival of fittest individual

[Figure: Roulette Wheel Selection. © http://www.softchitech.com/ec_intro_html]
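
Roulette wheel selection with an optional elitist step might look like the sketch below. The function names are hypothetical, the individuals are placeholder labels, and fitness values are assumed to be non-negative; the four fitness values are the ones from the worked example later in the slides.

```python
import random
from itertools import accumulate

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    cumulative = list(accumulate(fitnesses))   # right edges of the wheel slices
    spin = rng.uniform(0, cumulative[-1])
    for individual, edge in zip(population, cumulative):
        if spin <= edge:
            return individual
    return population[-1]  # guard against floating-point round-off

def select_generation(population, fitnesses, rng, elitist=True):
    """Spin the wheel once per slot; optionally force the fittest individual in."""
    survivors = [roulette_select(population, fitnesses, rng) for _ in population]
    if elitist:
        best_index = max(range(len(population)), key=lambda i: fitnesses[i])
        survivors[0] = population[best_index]
    return survivors

rng = random.Random(0)
population = ["A", "B", "C", "D"]          # placeholder individuals
fitnesses = [0.67, 0.23, 0.45, 0.94]
survivors = select_generation(population, fitnesses, rng)
```

Without the elitist step, even the fittest individual can be lost by chance, which is why the slide lists it as an optional safeguard.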

Crossover

* combine two individuals to create new individuals for possible inclusion in next generation

* main operator for local search (looking close to existing solutions)

* perform each crossover with probability pc {0.5, …, 0.8}

* crossover points selected at random

* individuals not crossed carried over in population

[Figure: two initial strings and the offspring produced by single-point, two-point, and uniform crossover]
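
The three crossover variants named above could be sketched as follows for bit strings of equal length. Function names, the default `pc`, and the 50/50 swap probability in the uniform variant are assumptions for illustration.

```python
import random

def single_point(a, b, rng):
    """Swap the tails of a and b after one random cut point."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def two_point(a, b, rng):
    """Swap the middle segment between two random cut points."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform(a, b, rng):
    """Swap each position independently with probability 0.5 (assumed)."""
    pairs = [(x, y) if rng.random() < 0.5 else (y, x) for x, y in zip(a, b)]
    return "".join(p[0] for p in pairs), "".join(p[1] for p in pairs)

def crossover(a, b, rng, pc=0.6, operator=single_point):
    """Apply the operator with probability pc; otherwise carry the parents over."""
    if rng.random() < pc:
        return operator(a, b, rng)
    return a, b

rng = random.Random(0)
child1, child2 = crossover("0" * 24, "1" * 24, rng, pc=1.0, operator=two_point)
```

Using all-zero and all-one parents, as in the usage line, makes it easy to see which segments each child inherited from which parent.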

Mutation

* each component of every individual is modified with probability pm

* main operator for global search (looking at new areas of the search space)

* pm usually small {…, 0.01}; rule of thumb = 1/no. of bits in chromosome

* individuals not mutated carried over in population

© http://www.softchitech.com/ec_intro_html
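
A bit-flip mutation operator with the rule-of-thumb default can be sketched as below. The function name `mutate` and the string representation are assumptions; only the per-bit probability pm and the 1/(number of bits) rule come from the slide.

```python
import random

def mutate(bits, rng, pm=None):
    """Flip each bit independently with probability pm (default: 1 / number of bits)."""
    if pm is None:
        pm = 1.0 / len(bits)  # rule of thumb from the slide
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < pm else b for b in bits)

rng = random.Random(0)
original = "000101010011101011110000"
mutant = mutate(original, rng)   # on average one bit of the 24 is flipped
```

With pm = 1/24 the expected number of flipped bits per individual is one, so many individuals pass through unchanged, matching the last bullet above.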

phenotype (m/z)     genotype                      fitness
3021 3058 3240      00010101 00111010 11110000    0.67
3017 3059 3165      00010001 00111011 10100101    0.23
3036 3185 3120      00100100 10111001 01111000    0.45
3197 3088 3106      11000101 01011000 01101010    0.94

selection

one-point crossover (p = 0.6)

mutation (p = 0.05)

[Figure: binary strings of the four individuals after selection, after one-point crossover, and after mutation]

starting generation

phenotype: 3021 3058 3240 | 3017 3059 3165 | 3036 3185 3120 | 3197 3088 3106
fitness: 0.67 | 0.23 | 0.45 | 0.94

next generation

phenotype: 3021 3049 3122 3166 3184 3240 3197 3120 3106 3213 3088 3042
fitness: 0.80 0.77 0.42 0.98

[Figure: genotype, phenotype, and fitness of the starting generation and the next generation]

GA Evolution

[Figure: accuracy in percent vs. generations. Source: http://www.sdsc.edu/skidl/projects/bio-SKIDL/]

[Figure: genetic algorithm learning, fitness criteria vs. generations. Source: http://www.demon.co.uk/apl385/apl96/skom.htm]

[Figure: fitness value (scaled) vs. iteration]

References

* Holland, J. (1992), Adaptation in natural and artificial systems, 2nd Ed. Cambridge: MIT Press.

* Davis, L. (Ed.) (1991), Handbook of genetic algorithms. New York: Van Nostrand Reinhold.

* Goldberg, D. (1989), Genetic algorithms in search, optimization and machine learning. Addison-Wesley.

* Fogel, D. (1995), Evolutionary computation: Towards a new philosophy of machine intelligence. Piscataway: IEEE Press.

* Bäck, T., Hammel, U., and Schwefel, H. (1997), "Evolutionary computation: Comments on the history and the current state", IEEE Trans. On Evol. Comp. 1, (1)

Online Resources

* http://www.spectroscopynow.com

* http://www.cs.bris.ac.uk/~colin/evollect1/evollect0/index.htm

* IlliGAL (http://www-illigal.ge.uiuc.edu/index.php3)

* GAlib (http://lancet.mit.edu/ga/)

[Figure: percent improvement vs. iteration]

Schema and GAs

* a schema is a template representing a set of bit strings, e.g. 1*** (where * matches either bit) covers bit strings that begin with 1

* every schema s has an estimated average fitness f(s):

E_{t+1} ≈ k [f(s)/f(pop)] E_t

* schema s receives exponentially increasing or decreasing numbers depending upon ratio f(s)/f(pop)

* above average schemas tend to spread through population while below average schemas disappear (simultaneously for all schemas: "implicit parallelism")
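
The schema idea can be made concrete with a small sketch. It reads E_t as the expected number of individuals matching schema s at generation t, which is an interpretation rather than something stated on the slide; the function names and the default k = 1 are likewise assumptions.

```python
def matches(schema, bits):
    """True if bits fits the schema; '*' in the schema matches either bit."""
    return len(schema) == len(bits) and all(s in ("*", b) for s, b in zip(schema, bits))

def schema_fitness(schema, population, fitnesses):
    """Estimated f(s): mean fitness of the individuals that match the schema."""
    values = [f for ind, f in zip(population, fitnesses) if matches(schema, ind)]
    return sum(values) / len(values) if values else 0.0

def expected_count(e0, ratio, generations, k=1.0):
    """Iterate E_{t+1} = k * ratio * E_t, with ratio = f(s)/f(pop) held fixed."""
    e = e0
    for _ in range(generations):
        e = k * ratio * e
    return e

population = ["1010", "1111", "0011", "0100"]
fitnesses = [0.8, 0.9, 0.3, 0.2]
f_s = schema_fitness("1***", population, fitnesses)    # mean over "1010" and "1111"
f_pop = sum(fitnesses) / len(fitnesses)
growth = expected_count(2, f_s / f_pop, generations=3)  # above-average schema grows
```

With a fixed ratio above 1 the count grows geometrically and with a ratio below 1 it shrinks geometrically, which is the exponential increase or decrease the bullets describe.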

MALDI-TOF

[Figure: MALDI-TOF. © www.protagen.de/pics/main/maldi2.html]
