Basics of Genetic Algorithms and Some Possibilities: Peter Spijker
Basics of Genetic Algorithms and Some Possibilities: Peter Spijker
Basics of Genetic Algorithms and Some Possibilities: Peter Spijker
Peter Spijker
Technische Universiteit Eindhoven Department of Biomedical Engineering Division of Biomedical Imaging and Modeling
13
California Institute of Technology Materials Process and Simulation Center Biochemistry & Molecular Biophysics
Presentation Overview Purpose of presentation General introduction to Genetic Algorithms (GAs) Biological background
Origin of species Natural selection
13
Genetic Algorithm
Search space Basic algorithm Coding
Methods
Examples
Possibilities
Purpose of presentation Optimising parameters of force fields is a difficult and time consuming task Use of optimising methods might be of use Methods:
- steepest descent - simulated annealing (Monte Carlo)
13
- genetic algorithms
13
Genetic algorithms (GAs) are a technique to solve problems which need optimization GAs are a subclass of Evolutionary Computing GAs are based on Darwins theory of evolution
History of GAs Evolutionary computing evolved in the 1960s. GAs were created by John Holland in the mid-70s.
Biological Background (1) The cell Every animal cell is a complex of many small factories working together The center of this all is the cell nucleus The nucleus contains the genetic information
13
13
Genetic information is stored in the chromosomes Each chromosome is build of DNA Chromosomes in humans form pairs There are 23 pairs The chromosome is divided in parts: genes
13
The entire combination of genes is called genotype A genotype develops to a phenotype Alleles can be either dominant or recessive Dominant alleles will always express from the genotype to the fenotype Recessive alleles can survive in the population for many generations, without being expressed.
13
Mitosis is copying the same genetic information to new offspring: there is no exchange of information
Biological Background (5) Reproduction Meiosis is the basis of sexual reproduction After meiotic division 2 gametes appear in the process In reproduction two gametes conjugate to a zygote wich will become the new individual Hence genetic information is shared between the parents in order to create new offspring
13
Biological Background (6) Reproduction During reproduction errors occur Due to these errors genetic variation exists Most important errors are: Recombination (cross-over) Mutation
13
13
The origin of species: Preservation of favourable variations and rejection of unfavourable variations. There are more individuals born than can survive, so there is a continuous struggle for life. Individuals with an advantage have a greater chance for survive: survival of the fittest.
Biological Background (8) Natural selection Important aspects in natural selection are: adaptation to the environment
13
isolation of populations in different groups which cannot mutually mate If small changes in the genotypes of individuals are expressed easily, especially in small populations, we speak of genetic drift Mathematical expresses as fitness: success in life
Presentation Overview Purpose of presentation General introduction to Genetic Algorithms (GAs) Biological background
Origin of species Natural selection
13
Genetic Algorithm
Search space Basic algorithm Coding
Methods
Examples
Possibilities
Genetic Algorithm (1) Search space Most often one is looking for the best solution in a specific subset of solutions
13
This subset is called the search space (or state space) Every point in the search space is a possible solution Therefore every point has a fitness value, depending on the problem definition GAs are used to search the search space for the best solution, e.g. a minimum
2.5 2
1.5
Difficulties are the local minima and the starting point of the search
0.5
100
200
300
400
500
600
700
800
900
1000
13
Starting with a subset of n randomly chosen solutions from the search space (i.e. chromosomes). This is the population This population is used to produce a next generation of individuals by reproduction Individuals with a higher fitness have more chance to reproduce (i.e. natural selection)
13
FITNESS : Evaluate fitness f(x) of each chromosome in the population NEW POPULATION 0 SELECTION : Based on f(x)
1 RECOMBINATION : Cross-over chromosomes 2 MUTATION 3 ACCEPTATION 3 : Mutate chromosomes : Reject or accept new one
generation
4 5 TEST LOOP : Test problem criterium : Continue step 1 4 until criterium is satisfied
13
Normal cells are diploid (containing 2 complete sets of chromosomes) On the contrary gametes are haploid Formalizing diploid reproduction is much more difficult than haploid Diploid populations have an extra dimension compared to haploid populations For simplicity therefore only haploid genetic algorithms
13
Every bitstring therefore is a solution but not necisseraly the best solution The way bitstrings can code differs from problem to problem
1 0 0 1
Genetic Algorithm (6) Coding Recombination (cross-over) can when using bitstrings schematically be represented:
13
1 0 0 1 1 0 1
0 1 0 1 1 1 0
1 0 0 1 1 1 0
0 1 0 1 1 0 1
13
Mutation prevents the algorithm to be trapped in a local minimum In the bitstring approach mutation is simpy the flipping of one of the bits
1 0 0 1 1 0 1
1 1 0 1 1 0 1
13
Both recombination and mutation depend a lot on the exact definition of the problem and the choice of representing the chromosomes (e.g. no bitstrings) Different encodings can be used:
Binary encoding
Permutation encoding
Value encoding Tree encoding
Example Minimum of Function (1) First example shows how to find the minimum of a function
2.5
13
1.5
0.5
1100101001
13
1.8
1.6
1.4
1.2
Mean fitness
1.3
Best fitness
0.8
0.6
1.1 1200 1
Individual
Best individual
Fitness
0.9
0.8
0.7
0.6
0.5
0.4
0.3
10
20
30
Generations
40 Generations
50
60
70
80
Example Minimum of Function (3) Interactive show of this algorithm with Matlab Using the function: genalg2() Variables:
Population size
Bitstringlength Mutation chance Recombination chance Starting population adaption
13
13
It is clear from the example that the convergence speed of the algorithm depends on many factors:
Population size
Mutation probability
Recombination probability Elitism
Selection methods
Random selection of parents Roulette wheel selection of parents
Strong point GAs: mutation prevents from falling in a local minimum, recombination initiates a fast first convergence
Example Checkboard (1) We are given an n by n checkboard in which every field can have a different colour from a set of four colours.
13
Goal is to achieve a checkboard in a way that there are no neighbours with the same colour (not diagonal)
10 1 2 3 4 5 6 7 8 9 10
10 1 2 3 4 5 6 7 8 9 10
Example Checkboard (2) Chromosomes represent the way the checkboard is coloured. Chromosomes are not represented by bitstrings but by bitmatrices The bits in the bitmatrix can have one of the four values 0, 1, 2 or 3, depending on the colour
13
Crossing-over involves matrix manipulation instead of point wise operating. Crossing-over can be combining the parential matrices in a horizontal, vertical, triangular or square way
Mutation remains bitwise changing bits in either one of the other numbers
13
170
165
160
Fitness
155
150
145
140
135
130
100
200
300 Generations
400
500
600
This problem can be seen as a graph with n nodes and (n-1) edges, so the fitness f(x) is easily defined as: f(x) = 2 (n-1) n
13
Fitness
100
200
300
400
500
200
400
600
800
Horizontal Cutting Crossing Over 180 170 180 170 160 150 140 130
Fitness
200
400 Generations
600
800
1500
Example Checkboard (5) Interactive show of this algorithm with Matlab Using the functions:
main() checkers() bestindividual()
13
mutate()
recombine() select() showbestindividual()
Possibilities Using the genetic algorithm to optimise parameters for a force field Parameters are real numbers, so adaptations of these algorithms is required Value incoding vs. bitstring encoding Difficulties:
Definition fitness function Integration algorithm with software
13
Further Questions
13